* IO scheduler based IO Controller V2
@ 2009-05-05 19:58 Vivek Goyal
  0 siblings, 0 replies; 97+ messages in thread

From: Vivek Goyal @ 2009-05-05 19:58 UTC (permalink / raw)
To: nauman, dpshah, lizf, mikew, fchecconi, paolo.valente, jens.axboe,
  ryov, fernando
Cc: akpm, vgoyal

Hi All,

Here is V2 of the IO controller patches, generated on top of 2.6.30-rc4.
The first version of the patches was posted here:

http://lkml.org/lkml/2009/3/11/486

This patchset is still a work in progress, but I want to keep getting
snapshots of my tree out at regular intervals for feedback, hence V2.
Before I go into the details of the major changes from V1, I wanted to
highlight the other IO controller proposals on lkml.

Other active IO controller proposals
------------------------------------
Currently there are primarily two other IO controller proposals out
there.

dm-ioband
---------
This patch set, from Ryo Tsuruta of valinux, is a proportional bandwidth
controller implemented as a dm driver.

http://people.valinux.co.jp/~ryov/dm-ioband/

The biggest issue (apart from others) with a 2nd-level IO controller is
that buffering of BIOs takes place in a single queue, and these BIOs are
dispatched to the underlying IO scheduler in FIFO order. That means that
whenever buffering takes place, it breaks CFQ's notion of class and
priority: RT requests might be stuck behind some write requests, or some
read requests might be stuck behind other write requests for a long
time, etc.

To demonstrate the single-FIFO dispatch issues, I ran some basic tests
and posted the results in the following mail thread:

http://lkml.org/lkml/2009/4/13/2

These are hard problems to solve, and one will end up maintaining
separate queues for separate classes and priorities, as CFQ does, to
fully resolve them.
But that will make the 2nd-level implementation complex. At the same
time, if somebody is trying to use the IO controller on a single disk or
on a hardware RAID with cfq as the scheduler, there will be two layers
of queueing maintaining separate queues per priority level: one at the
dm-driver level and another in CFQ, which again does not make a lot of
sense. On the other hand, if a user is running noop at the device level,
at the higher level we will be maintaining multiple CFQ-like queues,
which also does not make sense, as the underlying IO scheduler never
wanted that.

Hence, IMHO, controlling bios at a second level is probably not a very
good idea. We should instead do it at the IO scheduler level, where we
already maintain all the needed queues. We just need to make the
scheduling hierarchical and group-aware, so that the IO of one group is
isolated from the others.

IO-throttling
-------------
This patch set, from Andrea Righi, provides a max bandwidth controller.
That means it does not guarantee minimum bandwidth; it provides maximum
bandwidth limits and throttles an application if it crosses its
bandwidth. So this is not an apples-to-apples comparison: this patch set
and dm-ioband provide proportional bandwidth control, where a cgroup can
use much more bandwidth if there are no other users, and resource
control comes into the picture only when there is contention.

It seems there are both kinds of users out there: one set of people
needing proportional BW control, and other people needing max bandwidth
control. Now the question is, where should max bandwidth control be
implemented? At higher layers, or at the IO scheduler level? Should
proportional bw control and max bw control be implemented separately at
different layers, or should they be implemented in one place?

IMHO, if we are doing proportional bw control at the IO scheduler layer,
it should be possible to extend it to do max bw control there as well
without a lot of effort. Then it probably does not make much sense to do
the two types of control at two different layers.
Doing it in one place should lead to less code and reduced complexity.
Secondly, the io-throttling solution also buffers writes at a higher
layer, which again leads to losing the notion of priority of writes.
Hence, personally I think that users will need both proportional bw and
max bw control, and we should probably implement them in a single place
instead of splitting them. Once the elevator-based io controller
patchset matures, it can be enhanced to do max bw control as well.

Having said that, one issue with doing upper-limit control at the
elevator/IO scheduler level is that it does not have a view of
higher-level logical devices. So if there is a software RAID with two
disks, one cannot do max bw control on the logical device; instead it
has to be done on the leaf nodes where the io schedulers are attached.

Now back to the description of this patchset and the changes from V1.

- Rebased patches to 2.6.30-rc4.

- Last time Andrew mentioned that async writes are a big issue for us,
  hence introduced control for async writes also.

- Implemented per-group request descriptor support. This was needed to
  make sure that one group doing a lot of IO does not starve other
  groups of request descriptors, denying them their fair share. This is
  a basic patch right now, which will probably require more changes
  after some discussion.

- Exported the disk time used and the number of sectors dispatched by a
  cgroup through the cgroup interface. This should help us see how much
  disk time each group got and whether it is fair or not.

- Implemented group refcounting support. Lack of this was causing some
  cgroup-related issues. There are still some races left that need to be
  fixed.

- For IO tracking/async write tracking, started making use of the
  blkio-cgroup patches from Ryo Tsuruta posted here:
  http://lkml.org/lkml/2009/4/28/235

  Currently people seem to like the idea of a separate subsystem for
  tracking writes, so that the rest of the users can use that info
  instead of everybody implementing their own. (How many of those users
  will actually end up in the kernel is a different matter, and still
  not clear.) So instead of carrying my own version of the bio-cgroup
  patches and overloading the io controller cgroup subsystem, I am
  making use of the blkio-cgroup patches. For the time being one has to
  mount the io controller and blkio subsystems together on the same
  hierarchy. Later we can take care of the case where blkio is mounted
  on a different hierarchy.

- Replaced group priorities with group weights.

Testing
=======
Again, I have been able to do only very basic testing of reads and
writes, but did not want to hold the patches back because of testing.
Providing support for async writes took much more time than expected,
and work is still left in that area. I will continue to do more testing.

Test1 (Fairness for synchronous reads)
======================================
- Two dd's in two cgroups with cgroup weights of 1000 and 500. Ran the
  two "dd"s in those cgroups (with the CFQ scheduler and
  /sys/block/<device>/queue/fairness = 1).

dd if=/mnt/$BLOCKDEV/zerofile1 of=/dev/null &
dd if=/mnt/$BLOCKDEV/zerofile2 of=/dev/null &

234179072 bytes (234 MB) copied, 4.13954 s, 56.6 MB/s
234179072 bytes (234 MB) copied, 5.2127 s, 44.9 MB/s

group1 time=3108 group1 sectors=460968
group2 time=1405 group2 sectors=264944

This patchset tries to provide fairness in terms of disk time received.
group1 got almost double the disk time of group2 (at the time the first
dd finished). These time and sector statistics can be read using the
io.disk_time and io.disk_sector files in the cgroup. More about them in
the documentation file.
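As a sanity check, the configured weights and the reported disk times
can be compared directly. The following few lines of Python are just
arithmetic on the numbers quoted above; the 15% tolerance is an
arbitrary choice for illustration, not anything defined by the patchset.

```python
# Check that disk time received is roughly proportional to cgroup weight,
# using the io.disk_time numbers reported in Test1 above.

def achieved_ratio(time1, time2):
    """Ratio of disk time received by group1 vs group2."""
    return time1 / time2

weight_ratio = 1000 / 500                # configured weights of the two cgroups
time_ratio = achieved_ratio(3108, 1405)  # disk times at the first dd's finish

print(f"target ratio: {weight_ratio:.2f}, achieved: {time_ratio:.2f}")

# group1 should get about twice the disk time of group2.
assert abs(time_ratio - weight_ratio) / weight_ratio < 0.15
```

Here the achieved ratio works out to about 2.2, i.e. within roughly 10%
of the 2:1 target.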
Test2 (Fairness for async writes)
=================================
Fairness for async writes is tricky, and the biggest reason is that
async writes are cached in higher layers (the page cache) and are not
necessarily dispatched to the lower layers in a proportional manner. For
example, consider two dd threads reading /dev/zero as the input file and
writing huge files. Very soon we will cross vm_dirty_ratio, and a dd
thread will be forced to write out some pages to disk before more pages
can be dirtied. But the dirty pages picked are not necessarily those of
the same thread; writeback can very well pick the inode of the
lower-priority dd thread and do some writeout. So effectively the
higher-weight dd is doing writeouts of the lower-weight dd's pages, and
we don't see service differentiation.

IOW, the core problem with async write fairness is that the
higher-weight thread does not throw enough IO traffic at the IO
controller to keep its queue continuously backlogged. There are many
0.2- to 0.8-second intervals where the higher-weight queue is empty, and
in that time the lower-weight queue gets lots of work done, giving the
impression that there was no service differentiation.

In summary, from the IO controller's point of view, async write support
is there. Now we need to do some more work in the higher layers to make
sure a higher-weight process is not blocked behind the IO of some
lower-weight process. This is a TODO item.

So to test async writes I generated lots of write traffic in two cgroups
(50 fio threads) and watched the disk time statistics in the respective
cgroups at 2-second intervals. Thanks to Ryo Tsuruta for the test case.
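The backlogging problem described above can be illustrated with a toy
proportional-share model (a hypothetical sketch, not code from this
patchset; slice counts and window sizes are invented). A weight-2 queue
that is empty half the time loses most of its advantage to a
continuously backlogged weight-1 queue, because the service it misses
while empty cannot be reclaimed later:

```python
# Toy model: serve 1-unit slices to the backlogged queue with the smallest
# virtual time (virtual time advances by service/weight), as a
# proportional-share scheduler would.

def simulate(slices, high_busy):
    """high_busy(t) -> is the weight-2 queue backlogged at slice t?"""
    served = {"high": 0, "low": 0}
    vtime = {"high": 0.0, "low": 0.0}
    weight = {"high": 2, "low": 1}
    for t in range(slices):
        if not high_busy(t):
            # While the high-weight queue is empty it earns nothing; on
            # return its virtual time is caught up, so the lost service
            # stays lost.
            vtime["high"] = max(vtime["high"], vtime["low"])
            q = "low"
        else:
            q = min(("high", "low"), key=lambda n: vtime[n])
        served[q] += 1
        vtime[q] += 1.0 / weight[q]
    return served

# Continuously backlogged: the weight-2 queue gets its 2:1 share.
always = simulate(3000, lambda t: True)
# Backlogged only half the time (empty every other 100-slice window):
bursty = simulate(3000, lambda t: (t // 100) % 2 == 0)
print(always, bursty)
```

In the bursty case the weight-2 queue ends up with only about a third of
the total service despite its double weight, which is exactly the "no
apparent service differentiation" symptom described above.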
*****************************************************************
sync
echo 3 > /proc/sys/vm/drop_caches

fio_args="--size=64m --rw=write --numjobs=50 --group_reporting"

echo $$ > /cgroup/bfqio/test1/tasks
fio $fio_args --name=test1 --directory=/mnt/sdd1/fio/ --output=/mnt/sdd1/fio/test1.log &

echo $$ > /cgroup/bfqio/test2/tasks
fio $fio_args --name=test2 --directory=/mnt/sdd2/fio/ --output=/mnt/sdd2/fio/test2.log &
***********************************************************************

I then watched the disk time and sector statistics for both cgroups
every 2 seconds using a script. Here is a snippet from the output:

test1 statistics: time=9848  sectors=643152
test2 statistics: time=5224  sectors=258600

test1 statistics: time=11736 sectors=785792
test2 statistics: time=6509  sectors=333160

test1 statistics: time=13607 sectors=943968
test2 statistics: time=7443  sectors=394352

test1 statistics: time=15662 sectors=1089496
test2 statistics: time=8568  sectors=451152

So the disk time consumed by group1 is almost double that of group2.

Your feedback and comments are welcome.

Thanks
Vivek

^ permalink raw reply	[flat|nested] 97+ messages in thread
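The group-aware hierarchical scheduling argued for above can be sketched
in miniature (again a hypothetical model, not the patchset's code; group
and request names are invented): pick the group first by weighted
fairness, then honor class order inside the group, so an RT request is
never stuck behind another group's writes the way it would be in a
single FIFO:

```python
from collections import deque

# Minimal sketch of group-aware hierarchical dispatch: first pick a group
# (by smallest virtual time, i.e. service received divided by weight),
# then pick a queue inside that group honoring class order (RT before BE),
# as CFQ does within a single level.

class Group:
    def __init__(self, weight):
        self.weight = weight
        self.vtime = 0.0                      # service received / weight
        self.queues = {"RT": deque(), "BE": deque()}

    def backlogged(self):
        return any(self.queues.values())

    def dispatch_one(self):
        for cls in ("RT", "BE"):              # class order kept per group
            if self.queues[cls]:
                self.vtime += 1.0 / self.weight
                return self.queues[cls].popleft()

def dispatch(groups):
    ready = [g for g in groups if g.backlogged()]
    if ready:
        return min(ready, key=lambda g: g.vtime).dispatch_one()

g1, g2 = Group(weight=2), Group(weight=1)
g1.queues["BE"].extend(f"g1-be{i}" for i in range(6))
g2.queues["RT"].append("g2-rt0")              # RT request from group 2
g2.queues["BE"].extend(f"g2-be{i}" for i in range(6))

order = [dispatch([g1, g2]) for _ in range(9)]
print(order)
```

In the resulting dispatch order, g2's RT request goes out as soon as g2
is scheduled, and g1 still receives twice g2's service (6 of 9 slots),
so group isolation and class semantics coexist.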
* Re: IO scheduler based IO Controller V2
  2009-05-05 19:58 Vivek Goyal
@ 2009-05-05 20:24 ` Andrew Morton
  2009-05-06  8:11 ` Gui Jianfeng
  1 sibling, 0 replies; 97+ messages in thread

From: Andrew Morton @ 2009-05-05 20:24 UTC (permalink / raw)
To: Vivek Goyal
Cc: dhaval, snitzer, dm-devel, jens.axboe, agk, balbir, paolo.valente,
  fernando, jmoyer, fchecconi, containers, linux-kernel, righi.andrea

On Tue, 5 May 2009 15:58:27 -0400
Vivek Goyal <vgoyal@redhat.com> wrote:

> Hi All,
>
> Here is the V2 of the IO controller patches generated on top of 2.6.30-rc4.
> ...
> Currently primarily two other IO controller proposals are out there.
>
> dm-ioband
> ---------
> This patch set is from Ryo Tsuruta from valinux.
> ...
> IO-throttling
> -------------
> This patch set is from Andrea Righi provides max bandwidth controller.

I'm thinking we need to lock you guys in a room and come back in 15
minutes.

Seriously, how are we to resolve this? We could lock me in a room and
come back in 15 days, but there's no reason to believe that I'd emerge
with the best answer.

I tend to think that a cgroup-based controller is the way to go.
Anything else will need to be wired up to cgroups _anyway_, and that
might end up messy.
* Re: IO scheduler based IO Controller V2
  2009-05-05 20:24 ` Andrew Morton
@ 2009-05-05 22:20 ` Peter Zijlstra
  2009-05-06  3:42 ` Balbir Singh
  2 replies; 97+ messages in thread

From: Peter Zijlstra @ 2009-05-05 22:20 UTC (permalink / raw)
To: Andrew Morton
Cc: Vivek Goyal, nauman, dpshah, lizf, mikew, fchecconi, paolo.valente,
  jens.axboe, ryov, fernando, s-uchida, taka, guijianfeng, jmoyer,
  dhaval, balbir, linux-kernel, containers, righi.andrea, agk, dm-devel,
  snitzer, m-ikeda

On Tue, 2009-05-05 at 13:24 -0700, Andrew Morton wrote:
> On Tue, 5 May 2009 15:58:27 -0400
> Vivek Goyal <vgoyal@redhat.com> wrote:
> > ...
>
> I'm thinking we need to lock you guys in a room and come back in 15
> minutes.
>
> Seriously, how are we to resolve this? We could lock me in a room and
> come back in 15 days, but there's no reason to believe that I'd emerge
> with the best answer.
>
> I tend to think that a cgroup-based controller is the way to go.
> Anything else will need to be wired up to cgroups _anyway_, and that
> might end up messy.

FWIW I subscribe to the io-scheduler faith as opposed to the
device-mapper cult ;-)

Also, I don't think a simple throttle will be very useful; a more mature
solution should cater to more use cases.
* Re: IO scheduler based IO Controller V2
  2009-05-05 22:20 ` Peter Zijlstra
@ 2009-05-06  3:42 ` Balbir Singh
  1 sibling, 0 replies; 97+ messages in thread

From: Balbir Singh @ 2009-05-06 3:42 UTC (permalink / raw)
To: Peter Zijlstra
Cc: dhaval, snitzer, dm-devel, jens.axboe, agk, paolo.valente, fernando,
  jmoyer, righi.andrea, fchecconi, containers, linux-kernel,
  Andrew Morton

* Peter Zijlstra <peterz@infradead.org> [2009-05-06 00:20:49]:

> On Tue, 2009-05-05 at 13:24 -0700, Andrew Morton wrote:
> > ...
> > I tend to think that a cgroup-based controller is the way to go.
> > Anything else will need to be wired up to cgroups _anyway_, and that
> > might end up messy.
>
> FWIW I subscribe to the io-scheduler faith as opposed to the
> device-mapper cult ;-)
>
> Also, I don't think a simple throttle will be very useful; a more
> mature solution should cater to more use cases.

I tend to agree, unless Andrea can prove us wrong. I don't think
throttling a task (not letting it consume CPU or memory when its IO
quota is exceeded) is a good idea. I've asked Andrea that question a few
times, but got no response.

-- 
Balbir
* Re: IO scheduler based IO Controller V2
  2009-05-06  3:42 ` Balbir Singh
@ 2009-05-06 10:20 ` Fabio Checconi
  2009-05-06 17:10 ` Balbir Singh
  2009-05-06 18:47 ` Divyesh Shah
  ` (2 subsequent siblings)
  3 siblings, 2 replies; 97+ messages in thread

From: Fabio Checconi @ 2009-05-06 10:20 UTC (permalink / raw)
To: Balbir Singh
Cc: Peter Zijlstra, Andrew Morton, Vivek Goyal, nauman, dpshah, lizf,
  mikew, paolo.valente, jens.axboe, ryov, fernando, s-uchida, taka,
  guijianfeng, jmoyer, dhaval, linux-kernel, containers, righi.andrea,
  agk, dm-devel, snitzer, m-ikeda

Hi,

> From: Balbir Singh <balbir@linux.vnet.ibm.com>
> Date: Wed, May 06, 2009 09:12:54AM +0530
>
> * Peter Zijlstra <peterz@infradead.org> [2009-05-06 00:20:49]:
> > ...
> > FWIW I subscribe to the io-scheduler faith as opposed to the
> > device-mapper cult ;-)
> >
> > Also, I don't think a simple throttle will be very useful; a more
> > mature solution should cater to more use cases.
>
> I tend to agree, unless Andrea can prove us wrong. I don't think
> throttling a task (not letting it consume CPU, memory when its IO
> quota is exceeded) is a good idea. I've asked that question to Andrea
> a few times, but got no response.

From what I can see, the principle used by io-throttling is not too
different from what happens when bandwidth differentiation with
synchronous access patterns is achieved using idling at the io scheduler
level.

When an io scheduler anticipates requests from a task/cgroup, all the
other tasks with pending (synchronous) requests are in fact blocked, and
the fact that the task being anticipated is allowed to submit additional
io while they remain blocked is what creates the bandwidth
differentiation among them.

Of course there are many differences, in particular related to the
latencies introduced by the two mechanisms, the granularity they use to
allocate disk service, and to what throttling and proportional-share io
scheduling can or cannot guarantee, but AFAIK both of them rely on
blocking tasks to create bandwidth differentiation.
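The mechanism Fabio describes can be seen in a toy model (a hypothetical
sketch; the tick counts and slice length are invented, and no real
scheduler works exactly like this). Task A is synchronous: it submits
one request, "thinks" for a tick, then submits the next. Task B is
always backlogged. Idling blocks B for a tick after each of A's
requests, and that blocking is precisely what preserves A's share:

```python
# Toy model of anticipation/idling. During A's slice, the scheduler either
# idles one tick waiting for A's next request (anticipation), or gives up
# the slice as soon as A's queue is empty (work-conserving, no idling).

def run(rounds, idle_for_A, slice_len=8):
    served = {"A": 0, "B": 0}
    for _ in range(rounds):
        t = 0
        while t < slice_len:
            served["A"] += 1          # serve A's single queued request
            t += 1
            if not idle_for_A:
                break                 # A's queue is empty: slice is lost
            t += 1                    # idle one tick; B stays blocked
        served["B"] += slice_len      # B is always backlogged: full slice
    return served

with_idling = run(100, idle_for_A=True)
no_idling = run(100, idle_for_A=False)
print(with_idling, no_idling)
```

With idling A keeps about a third of the total service (at the cost of
idle disk time that blocks B); without idling A's share collapses to
about a ninth, since each of its slices ends after a single request.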
* Re: IO scheduler based IO Controller V2
  2009-05-06 10:20 ` Fabio Checconi
@ 2009-05-06 17:10 ` Balbir Singh
  [not found] ` <20090506102030.GB20544-f9ZlEuEWxVeACYmtYXMKmw@public.gmane.org>
  1 sibling, 0 replies; 97+ messages in thread
From: Balbir Singh @ 2009-05-06 17:10 UTC (permalink / raw)
To: Fabio Checconi
Cc: dhaval, snitzer, dm-devel, jens.axboe, agk, paolo.valente, fernando,
    jmoyer, righi.andrea, containers, linux-kernel, Andrew Morton

* Fabio Checconi <fchecconi@gmail.com> [2009-05-06 12:20:30]:

> Hi,
>
> > From: Balbir Singh <balbir@linux.vnet.ibm.com>
> > Date: Wed, May 06, 2009 09:12:54AM +0530
> >
> > * Peter Zijlstra <peterz@infradead.org> [2009-05-06 00:20:49]:
> >
> > > On Tue, 2009-05-05 at 13:24 -0700, Andrew Morton wrote:
> > > > On Tue, 5 May 2009 15:58:27 -0400
> > > > Vivek Goyal <vgoyal@redhat.com> wrote:
> > > >
> > > > >
> > > > > Hi All,
> > > > >
> > > > > Here is the V2 of the IO controller patches generated on top of 2.6.30-rc4.
> > > > > ...
> > > > > Currently primarily two other IO controller proposals are out there.
> > > > >
> > > > > dm-ioband
> > > > > ---------
> > > > > This patch set is from Ryo Tsuruta from valinux.
> > > > > ...
> > > > > IO-throttling
> > > > > -------------
> > > > > This patch set is from Andrea Righi and provides a max bandwidth controller.
> > > >
> > > > I'm thinking we need to lock you guys in a room and come back in 15 minutes.
> > > >
> > > > Seriously, how are we to resolve this?  We could lock me in a room and
> > > > come back in 15 days, but there's no reason to believe that I'd emerge
> > > > with the best answer.
> > > >
> > > > I tend to think that a cgroup-based controller is the way to go.
> > > > Anything else will need to be wired up to cgroups _anyway_, and that
> > > > might end up messy.
> > >
> > > FWIW I subscribe to the io-scheduler faith as opposed to the
> > > device-mapper cult ;-)
> > >
> > > Also, I don't think a simple throttle will be very useful; a more mature
> > > solution should cater to more use cases.
> > >
> > I tend to agree, unless Andrea can prove us wrong. I don't think
> > throttling a task (not letting it consume CPU or memory when its IO
> > quota is exceeded) is a good idea. I've asked that question to Andrea
> > a few times, but got no response.
> >
>
> From what I can see, the principle used by io-throttling is not too
> different from what happens when bandwidth differentiation with synchronous
> access patterns is achieved using idling at the io scheduler level.
>
> When an io scheduler anticipates requests from a task/cgroup, all the
> other tasks with pending (synchronous) requests are in fact blocked, and
> the fact that the task being anticipated is allowed to submit additional
> io while they remain blocked is what creates the bandwidth differentiation
> among them.
>
> Of course there are many differences, in particular related to the
> latencies introduced by the two mechanisms, the granularity they use to
> allocate disk service, and to what throttling and proportional share io
> scheduling can or cannot guarantee, but AFAIK both of them rely on
> blocking tasks to create bandwidth differentiation.

My concern stems from the fact that in this case we might end up
throttling all the tasks in the group... no? I'll take a closer look.

-- 
Balbir

^ permalink raw reply	[flat|nested] 97+ messages in thread
* Re: IO scheduler based IO Controller V2
  2009-05-06  3:42 ` Balbir Singh
  2009-05-06 10:20 ` Fabio Checconi
@ 2009-05-06 18:47 ` Divyesh Shah
  [not found] ` <20090506034254.GD4416-SINUvgVNF2CyUtPGxGje5AC/G2K4zDHf@public.gmane.org>
  2009-05-06 20:42 ` Andrea Righi
  3 siblings, 0 replies; 97+ messages in thread
From: Divyesh Shah @ 2009-05-06 18:47 UTC (permalink / raw)
To: balbir
Cc: Peter Zijlstra, Andrew Morton, Vivek Goyal, nauman, lizf, mikew,
    fchecconi, paolo.valente, jens.axboe, ryov, fernando, s-uchida, taka,
    guijianfeng, jmoyer, dhaval, linux-kernel, containers, righi.andrea,
    agk, dm-devel, snitzer, m-ikeda

Balbir Singh wrote:
> * Peter Zijlstra <peterz@infradead.org> [2009-05-06 00:20:49]:
>
>> On Tue, 2009-05-05 at 13:24 -0700, Andrew Morton wrote:
>>> On Tue, 5 May 2009 15:58:27 -0400
>>> Vivek Goyal <vgoyal@redhat.com> wrote:
>>>
>>>> Hi All,
>>>>
>>>> Here is the V2 of the IO controller patches generated on top of 2.6.30-rc4.
>>>> ...
>>>> Currently primarily two other IO controller proposals are out there.
>>>>
>>>> dm-ioband
>>>> ---------
>>>> This patch set is from Ryo Tsuruta from valinux.
>>>> ...
>>>> IO-throttling
>>>> -------------
>>>> This patch set is from Andrea Righi and provides a max bandwidth controller.
>>> I'm thinking we need to lock you guys in a room and come back in 15 minutes.
>>>
>>> Seriously, how are we to resolve this?  We could lock me in a room and
>>> come back in 15 days, but there's no reason to believe that I'd emerge
>>> with the best answer.
>>>
>>> I tend to think that a cgroup-based controller is the way to go.
>>> Anything else will need to be wired up to cgroups _anyway_, and that
>>> might end up messy.
>> FWIW I subscribe to the io-scheduler faith as opposed to the
>> device-mapper cult ;-)
>>
>> Also, I don't think a simple throttle will be very useful; a more mature
>> solution should cater to more use cases.
>>
> I tend to agree, unless Andrea can prove us wrong. I don't think
> throttling a task (not letting it consume CPU, memory when its IO
> quota is exceeded) is a good idea. I've asked that question to Andrea
> a few times, but got no response.

I agree with what Balbir said about the effects of throttling on the
memory and cpu usage of that task.

Nauman and I have been working on Vivek's set of patches (which also
includes some patches by Nauman) and have been testing and developing on
top of that. I've found this solution to be the one that takes us closest
to a complete solution. This approach works well under the assumption
that the queues are backlogged, and in the limited testing that we've done
so far it doesn't fare that badly when they are not backlogged (though
there is definitely room for improvement there).

With buffered writes, when the queues are not backlogged, I think it might
be useful to explore the vm space and see if we can do something there
without any impact on the task's memory or cpu usage. I don't have any
brilliant ideas on this now but want to get people thinking about this.

^ permalink raw reply	[flat|nested] 97+ messages in thread
* Re: IO scheduler based IO Controller V2
  2009-05-06  3:42 ` Balbir Singh
  ` (2 preceding siblings ...)
  [not found] ` <20090506034254.GD4416-SINUvgVNF2CyUtPGxGje5AC/G2K4zDHf@public.gmane.org>
@ 2009-05-06 20:42 ` Andrea Righi
  3 siblings, 0 replies; 97+ messages in thread
From: Andrea Righi @ 2009-05-06 20:42 UTC (permalink / raw)
To: Balbir Singh
Cc: Peter Zijlstra, Andrew Morton, Vivek Goyal, nauman, dpshah, lizf,
    mikew, fchecconi, paolo.valente, jens.axboe, ryov, fernando,
    s-uchida, taka, guijianfeng, jmoyer, dhaval, linux-kernel,
    containers, agk, dm-devel, snitzer, m-ikeda

On Wed, May 06, 2009 at 09:12:54AM +0530, Balbir Singh wrote:
> * Peter Zijlstra <peterz@infradead.org> [2009-05-06 00:20:49]:
>
> > On Tue, 2009-05-05 at 13:24 -0700, Andrew Morton wrote:
> > > On Tue, 5 May 2009 15:58:27 -0400
> > > Vivek Goyal <vgoyal@redhat.com> wrote:
> > >
> > > >
> > > > Hi All,
> > > >
> > > > Here is the V2 of the IO controller patches generated on top of 2.6.30-rc4.
> > > > ...
> > > > Currently primarily two other IO controller proposals are out there.
> > > >
> > > > dm-ioband
> > > > ---------
> > > > This patch set is from Ryo Tsuruta from valinux.
> > > > ...
> > > > IO-throttling
> > > > -------------
> > > > This patch set is from Andrea Righi and provides a max bandwidth controller.
> > >
> > > I'm thinking we need to lock you guys in a room and come back in 15 minutes.
> > >
> > > Seriously, how are we to resolve this?  We could lock me in a room and
> > > come back in 15 days, but there's no reason to believe that I'd emerge
> > > with the best answer.
> > >
> > > I tend to think that a cgroup-based controller is the way to go.
> > > Anything else will need to be wired up to cgroups _anyway_, and that
> > > might end up messy.
> >
> > FWIW I subscribe to the io-scheduler faith as opposed to the
> > device-mapper cult ;-)
> >
> > Also, I don't think a simple throttle will be very useful; a more mature
> > solution should cater to more use cases.
> >
> I tend to agree, unless Andrea can prove us wrong. I don't think
> throttling a task (not letting it consume CPU, memory when its IO
> quota is exceeded) is a good idea. I've asked that question to Andrea
> a few times, but got no response.

Sorry Balbir, I probably missed your question, or replied in a different
thread maybe...

Actually we could allow an offending cgroup to continue to submit IO
requests without throttling it directly, but if we don't want to waste
memory on pending IO requests or pending writeback pages, we need to
block it sooner or later. Instead of directly throttling the offending
applications, we could block them when we hit a max limit of requests or
dirty pages, i.e. something like congestion_wait(). But that's the same,
no? The difference is that in this case the throttling is asynchronous.
Or am I oversimplifying it?

As an example, with writeback IO io-throttle doesn't throttle the IO
requests directly; each request instead receives a deadline (depending
on the BW limit) and is added into an rbtree. Then all the requests are
dispatched asynchronously by a kernel thread (kiothrottled), only once
their deadline has expired.

OK, there's a lot of space for improvements: provide many kernel threads
per block device, multiple queues/rbtrees, etc., but this is actually a
way to apply throttling asynchronously. The fact is that if I don't also
apply the throttling in balance_dirty_pages() (and I did so in the last
io-throttle version) or add a max limit of requests, the rbtree grows
indefinitely...

That should be very similar to the proportional BW solution allocating a
quota of nr_requests per block device and per cgroup.

-Andrea

^ permalink raw reply	[flat|nested] 97+ messages in thread
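[Editor's note: the deadline/rbtree dispatch scheme Andrea describes can be
sketched as a small userspace model. This is an illustration only, not the
io-throttle code: a heap stands in for the kernel rbtree, and the class and
method names are invented for the sketch.]

```python
import heapq

class ThrottledQueue:
    """Requests are never rejected: each gets a deadline derived from the
    cgroup's bandwidth limit, and a dispatcher (the 'kiothrottled' role)
    releases them only once the deadline has expired."""

    def __init__(self, bw_limit_bytes_per_s):
        self.bw = bw_limit_bytes_per_s
        self.next_slot = 0.0    # earliest time the next request may start
        self.pending = []       # (deadline, seq, request) min-heap

    def submit(self, now, nbytes, req, seq):
        # deadline: when enough budget has accrued for nbytes at rate self.bw
        start = max(now, self.next_slot)
        deadline = start + nbytes / self.bw
        self.next_slot = deadline
        heapq.heappush(self.pending, (deadline, seq, req))

    def dispatch_expired(self, now):
        # asynchronous dispatch: release every request whose deadline passed
        out = []
        while self.pending and self.pending[0][0] <= now:
            out.append(heapq.heappop(self.pending)[2])
        return out
```

Note how this also exhibits the problem Andrea mentions: nothing here bounds
`pending`, so without a max-requests limit (or throttling in
balance_dirty_pages()) the tree grows indefinitely.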
* Re: IO scheduler based IO Controller V2
  2009-05-05 20:24 ` Andrew Morton
  (?) (?)
@ 2009-05-06  2:33 ` Vivek Goyal
  2009-05-06 17:59 ` Nauman Rafique
  ` (4 more replies)
  -1 siblings, 5 replies; 97+ messages in thread
From: Vivek Goyal @ 2009-05-06 2:33 UTC (permalink / raw)
To: Andrew Morton
Cc: nauman, dpshah, lizf, mikew, fchecconi, paolo.valente, jens.axboe,
    ryov, fernando, s-uchida, taka, guijianfeng, jmoyer, dhaval, balbir,
    linux-kernel, containers, righi.andrea, agk, dm-devel, snitzer,
    m-ikeda, peterz

On Tue, May 05, 2009 at 01:24:41PM -0700, Andrew Morton wrote:
> On Tue, 5 May 2009 15:58:27 -0400
> Vivek Goyal <vgoyal@redhat.com> wrote:
>
> >
> > Hi All,
> >
> > Here is the V2 of the IO controller patches generated on top of 2.6.30-rc4.
> > ...
> > Currently primarily two other IO controller proposals are out there.
> >
> > dm-ioband
> > ---------
> > This patch set is from Ryo Tsuruta from valinux.
> > ...
> > IO-throttling
> > -------------
> > This patch set is from Andrea Righi and provides a max bandwidth controller.
>
> I'm thinking we need to lock you guys in a room and come back in 15 minutes.
>
> Seriously, how are we to resolve this?  We could lock me in a room and
> come back in 15 days, but there's no reason to believe that I'd emerge
> with the best answer.
>
> I tend to think that a cgroup-based controller is the way to go.
> Anything else will need to be wired up to cgroups _anyway_, and that
> might end up messy.

Hi Andrew,

Sorry, I did not get what you mean by a cgroup-based controller. If you
mean that we use cgroups for grouping tasks for controlling IO, then both
the IO scheduler based controller as well as the io-throttling proposal
do that. dm-ioband also supports that to some extent, but it requires the
extra step of transferring cgroup grouping information to the dm-ioband
device using dm-tools.

But if you meant the io-throttle patches, then I think they solve only
part of the problem, and that is max bw control. They do not offer the
minimum BW/minimum disk share guarantees offered by proportional BW
control.

IOW, io-throttle supports upper limit control but does not provide a work
conserving IO controller which lets a group use the whole BW if competing
groups are not present. IMHO, proportional BW control is an important
feature which we will need, and IIUC, the io-throttle patches can't be
easily extended to support proportional BW control; OTOH, one should be
able to extend the IO scheduler based proportional weight controller to
also support max bw control.

Andrea, last time you were planning to have a look at my patches and see
if a max bw controller can be implemented there. I got a feeling that it
should not be too difficult to implement. We already have the
hierarchical tree of io queues and groups in the elevator layer, and we
run the BFQ (WF2Q+) algorithm to select the next queue to dispatch IO
from. It is just a matter of also keeping track of the IO rate per
queue/group, and we should easily be able to delay the dispatch of IO
from a queue if its group has crossed the specified max bw.

This should lead to less code and reduced complexity (compared with the
case where we do max bw control with the io-throttling patches and
proportional BW control using the IO scheduler based control patches).

So do you think that it would make sense to do max BW control along with
the proportional weight IO controller at the IO scheduler? If yes, then
we can work together and continue to develop this patchset to also
support max bw control and meet your requirements, and drop the
io-throttling patches.

The only thing which concerns me is the fact that the IO scheduler does
not have the view of the higher level logical device. So if somebody has
set up a software RAID and wants to put a max BW limit on the software
raid device, this solution will not work. One will have to live with max
bw limits on the individual disks (where the io scheduler is actually
running). Do your patches allow putting a limit on software RAID devices
as well?
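[Editor's note: the per-group rate tracking Vivek sketches above amounts to a
token bucket consulted at dispatch time. A minimal userspace illustration
follows; the class name, the `may_dispatch` interface, and the one-second
burst cap are assumptions of this sketch, not the patchset's actual API.]

```python
class GroupBudget:
    """Max-bw check layered on a proportional scheduler: the scheduler
    already picks the next group to dispatch from, so max bw control
    reduces to a per-group rate check that delays dispatch when the
    group has crossed its configured limit."""

    def __init__(self, max_bps):
        self.max_bps = max_bps   # configured max bandwidth (bytes/sec)
        self.tokens = max_bps    # start with a full one-second burst
        self.last = 0.0

    def may_dispatch(self, now, nbytes):
        # refill tokens at max_bps, capping the burst at one second's worth
        self.tokens = min(self.max_bps,
                          self.tokens + (now - self.last) * self.max_bps)
        self.last = now
        if self.tokens >= nbytes:
            self.tokens -= nbytes
            return True
        return False             # keep the group queued; dispatch later
```

A failed check does not throttle the task itself: the group's queue simply
stays in the scheduler's tree and is retried later, which is what keeps the
controller work conserving for groups under their limit.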
Ryo, dm-ioband breaks the notion of classes and priorities of CFQ because
of the FIFO dispatch of buffered bios. Apart from that, it tries to
provide fairness in terms of actual IO done, and that would mean a seeky
workload can end up using the disk for much longer to get an equivalent
amount of IO done, slowing down other applications. Implementing the IO
controller at the IO scheduler level gives us tighter control. Will it not
meet your requirements? If you have specific concerns with the IO
scheduler based control patches, please highlight them and we will see
how they can be addressed.

Thanks
Vivek

^ permalink raw reply	[flat|nested] 97+ messages in thread
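[Editor's note: a back-of-the-envelope calculation makes the seeky-workload
point above concrete. All figures here (seek times, transfer rate, IO sizes)
are made up for illustration.]

```python
# For the same 10 MB of IO, a random 4 KB workload pays a seek per request
# while a sequential 1 MB workload pays almost none, so fairness measured
# in bytes completed lets the seeky task hold the disk far longer.
def disk_time_ms(nbytes, io_size, seek_ms, xfer_bytes_per_ms):
    ios = nbytes // io_size
    return ios * seek_ms + nbytes / xfer_bytes_per_ms

MB = 1 << 20
rate = 100 * MB // 1000                           # ~100 MB/s transfer rate
seq   = disk_time_ms(10 * MB, 1 * MB, 1, rate)    # ~110 ms of disk time
seeky = disk_time_ms(10 * MB, 4096,   8, rate)    # ~20 s for the same bytes
```

With these (invented) numbers the seeky workload occupies the disk well over
a hundred times longer for the same amount of IO, which is why time-based
fairness at the IO scheduler gives tighter isolation than byte-based
fairness.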
* Re: IO scheduler based IO Controller V2 2009-05-06 2:33 ` Vivek Goyal @ 2009-05-06 17:59 ` Nauman Rafique 2009-05-06 20:07 ` Andrea Righi ` (3 subsequent siblings) 4 siblings, 0 replies; 97+ messages in thread From: Nauman Rafique @ 2009-05-06 17:59 UTC (permalink / raw) To: Vivek Goyal Cc: Andrew Morton, dpshah, lizf, mikew, fchecconi, paolo.valente, jens.axboe, ryov, fernando, s-uchida, taka, guijianfeng, jmoyer, dhaval, balbir, linux-kernel, containers, righi.andrea, agk, dm-devel, snitzer, m-ikeda, peterz On Tue, May 5, 2009 at 7:33 PM, Vivek Goyal <vgoyal@redhat.com> wrote: > On Tue, May 05, 2009 at 01:24:41PM -0700, Andrew Morton wrote: >> On Tue, 5 May 2009 15:58:27 -0400 >> Vivek Goyal <vgoyal@redhat.com> wrote: >> >> > >> > Hi All, >> > >> > Here is the V2 of the IO controller patches generated on top of 2.6.30-rc4. >> > ... >> > Currently primarily two other IO controller proposals are out there. >> > >> > dm-ioband >> > --------- >> > This patch set is from Ryo Tsuruta from valinux. >> > ... >> > IO-throttling >> > ------------- >> > This patch set is from Andrea Righi provides max bandwidth controller. >> >> I'm thinking we need to lock you guys in a room and come back in 15 minutes. >> >> Seriously, how are we to resolve this? We could lock me in a room and >> cmoe back in 15 days, but there's no reason to believe that I'd emerge >> with the best answer. >> >> I tend to think that a cgroup-based controller is the way to go. >> Anything else will need to be wired up to cgroups _anyway_, and that >> might end up messy. > > Hi Andrew, > > Sorry, did not get what do you mean by cgroup based controller? If you > mean that we use cgroups for grouping tasks for controlling IO, then both > IO scheduler based controller as well as io throttling proposal do that. > dm-ioband also supports that up to some extent but it requires extra step of > transferring cgroup grouping information to dm-ioband device using dm-tools. 
> > But if you meant that io-throttle patches, then I think it solves only > part of the problem and that is max bw control. It does not offer minimum > BW/minimum disk share gurantees as offered by proportional BW control. > > IOW, it supports upper limit control and does not support a work conserving > IO controller which lets a group use the whole BW if competing groups are > not present. IMHO, proportional BW control is an important feature which > we will need and IIUC, io-throttle patches can't be easily extended to support > proportional BW control, OTOH, one should be able to extend IO scheduler > based proportional weight controller to also support max bw control. > > Andrea, last time you were planning to have a look at my patches and see > if max bw controller can be implemented there. I got a feeling that it > should not be too difficult to implement it there. We already have the > hierarchical tree of io queues and groups in elevator layer and we run > BFQ (WF2Q+) algorithm to select next queue to dispatch the IO from. It is > just a matter of also keeping track of IO rate per queue/group and we should > be easily be able to delay the dispatch of IO from a queue if its group has > crossed the specified max bw. > > This should lead to less code and reduced complextiy (compared with the > case where we do max bw control with io-throttling patches and proportional > BW control using IO scheduler based control patches). > > So do you think that it would make sense to do max BW control along with > proportional weight IO controller at IO scheduler? If yes, then we can > work together and continue to develop this patchset to also support max > bw control and meet your requirements and drop the io-throttling patches. > > The only thing which concerns me is the fact that IO scheduler does not > have the view of higher level logical device. 
So if somebody has setup a > software RAID and wants to put max BW limit on software raid device, this > solution will not work. One shall have to live with max bw limits on > individual disks (where io scheduler is actually running). Do your patches > allow putting a limit on software RAID devices also? > > Ryo, dm-ioband breaks the notion of classes and priority of CFQ because > of FIFO dispatch of buffered bios. Apart from that it tries to provide > fairness in terms of actual IO done and that would mean a seeky workload > can use the disk for much longer to get equivalent IO done and slow down > other applications. Implementing IO controller at IO scheduler level gives > us tighter control. Will it not meet your requirements? If you have specific > concerns with IO scheduler based control patches, please highlight these and > we will see how these can be addressed. In my opinion, IO throttling and dm-ioband are probably simpler, but incomplete solutions to the problem. And for a solution to be complete, it would have to be at the IO scheduler layer so it can do things like taking an IO as soon as it comes and sticking it at the front of all the queues so that it can go to the disk right away. This patch set is big, but it takes us in the right direction. Our ultimate goal should be to reach the level of control that we can have over CPU and network resources. And I don't think the IO throttling and dm-ioband approaches take us in that direction. > > Thanks > Vivek > ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: IO scheduler based IO Controller V2 2009-05-06 2:33 ` Vivek Goyal 2009-05-06 17:59 ` Nauman Rafique @ 2009-05-06 20:07 ` Andrea Righi 2009-05-06 21:21 ` Vivek Goyal 2009-05-06 21:21 ` Vivek Goyal [not found] ` <20090506023332.GA1212-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> ` (2 subsequent siblings) 4 siblings, 2 replies; 97+ messages in thread From: Andrea Righi @ 2009-05-06 20:07 UTC (permalink / raw) To: Vivek Goyal Cc: Andrew Morton, nauman, dpshah, lizf, mikew, fchecconi, paolo.valente, jens.axboe, ryov, fernando, s-uchida, taka, guijianfeng, jmoyer, dhaval, balbir, linux-kernel, containers, agk, dm-devel, snitzer, m-ikeda, peterz On Tue, May 05, 2009 at 10:33:32PM -0400, Vivek Goyal wrote: > On Tue, May 05, 2009 at 01:24:41PM -0700, Andrew Morton wrote: > > On Tue, 5 May 2009 15:58:27 -0400 > > Vivek Goyal <vgoyal@redhat.com> wrote: > > > > > > > > Hi All, > > > > > > Here is the V2 of the IO controller patches generated on top of 2.6.30-rc4. > > > ... > > > Currently primarily two other IO controller proposals are out there. > > > > > > dm-ioband > > > --------- > > > This patch set is from Ryo Tsuruta from valinux. > > > ... > > > IO-throttling > > > ------------- > > > This patch set is from Andrea Righi provides max bandwidth controller. > > > > I'm thinking we need to lock you guys in a room and come back in 15 minutes. > > > > Seriously, how are we to resolve this? We could lock me in a room and > > cmoe back in 15 days, but there's no reason to believe that I'd emerge > > with the best answer. > > > > I tend to think that a cgroup-based controller is the way to go. > > Anything else will need to be wired up to cgroups _anyway_, and that > > might end up messy. > > Hi Andrew, > > Sorry, did not get what do you mean by cgroup based controller? If you > mean that we use cgroups for grouping tasks for controlling IO, then both > IO scheduler based controller as well as io throttling proposal do that. 
> dm-ioband also supports that up to some extent but it requires extra step of > transferring cgroup grouping information to dm-ioband device using dm-tools. > > But if you meant that io-throttle patches, then I think it solves only > part of the problem and that is max bw control. It does not offer minimum > BW/minimum disk share guarantees as offered by proportional BW control. > > IOW, it supports upper limit control and does not support a work conserving > IO controller which lets a group use the whole BW if competing groups are > not present. IMHO, proportional BW control is an important feature which > we will need and IIUC, io-throttle patches can't be easily extended to support > proportional BW control. OTOH, one should be able to extend IO scheduler > based proportional weight controller to also support max bw control. Well, IMHO the big concern is at which level we want to implement the logic of control: at the IO scheduler, when the IO requests are already submitted and need to be dispatched, or at a higher level, when the applications generate IO requests (or maybe both). And, as pointed out by Andrew, do everything by a cgroup-based controller. The other features (proportional BW, throttling, taking the current ioprio model into account, etc.) are implementation details, and any of the proposed solutions can be extended to support all these features. I mean, io-throttle can be extended to support proportional BW (from a certain perspective it is already provided by the throttling water mark in v16), just as the IO scheduler based controller can be extended to support absolute BW limits. The same for dm-ioband. I don't think there are huge obstacles to merging the functionalities in this sense. > > Andrea, last time you were planning to have a look at my patches and see > if max bw controller can be implemented there. I got a feeling that it > should not be too difficult to implement it there. 
We already have the > hierarchical tree of io queues and groups in elevator layer and we run > BFQ (WF2Q+) algorithm to select next queue to dispatch the IO from. It is > just a matter of also keeping track of IO rate per queue/group and we should > easily be able to delay the dispatch of IO from a queue if its group has > crossed the specified max bw. Yes, sorry for my late reply. I quickly tested your patchset, but I still need to understand many details of your solution. In the next days I'll re-read everything carefully and I'll try to do a detailed review of your patchset (just re-building the kernel with your patchset applied). > > This should lead to less code and reduced complexity (compared with the > case where we do max bw control with io-throttling patches and proportional > BW control using IO scheduler based control patches). mmmh... changing the logic at the elevator and all IO schedulers doesn't sound like reduced complexity and less code changed. With io-throttle we just need to place the cgroup_io_throttle() hook in the right functions where we want to apply throttling. This is a quite easy approach to extend the IO control also to logical devices (more in general, devices that use their own make_request_fn) or even network-attached devices, as well as networking filesystems, etc. But I may be wrong. As I said, I still need to review your solution in detail. > > So do you think that it would make sense to do max BW control along with > proportional weight IO controller at IO scheduler? If yes, then we can > work together and continue to develop this patchset to also support max > bw control and meet your requirements and drop the io-throttling patches. It is surely worth exploring. Honestly, I don't know if it would be a better solution or not. Probably comparing some results with different IO workloads is the best way to proceed and decide which is the right way to go. This is necessary IMHO before totally dropping one solution or another. 
> > The only thing which concerns me is the fact that IO scheduler does not > have the view of higher level logical device. So if somebody has setup a > software RAID and wants to put max BW limit on software raid device, this > solution will not work. One shall have to live with max bw limits on > individual disks (where io scheduler is actually running). Do your patches > allow putting a limit on software RAID devices also? No, but as I said above, my patchset provides the interfaces to apply the IO control and accounting wherever we want. At the moment there's just one interface, cgroup_io_throttle(). -Andrea ^ permalink raw reply [flat|nested] 97+ messages in thread
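The hook-based approach Andrea describes can be pictured with a minimal userspace model: a single accounting function sits on the submission path and tells the submitting task how long to sleep once its cgroup exceeds its configured limit. cgroup_io_throttle() is the real interface name from the io-throttle patches; everything else here (structure names, the strict no-burst accounting) is an illustrative assumption:

```c
#include <assert.h>

/* Hypothetical model of a submission-side throttling hook: track the
 * total bytes a cgroup has submitted and compute the sleep needed so
 * that the average rate never exceeds limit_bps. A real implementation
 * would allow some burst; this strict version keeps the arithmetic
 * obvious. */

struct iot_cgroup {
    unsigned long long limit_bps;  /* max bandwidth (bytes/sec) */
    unsigned long long submitted;  /* total bytes submitted so far */
};

/* Conceptually called where the IO enters the block layer (e.g. where
 * a make_request_fn would run). Returns ms the submitter should sleep. */
static unsigned long long iot_throttle(struct iot_cgroup *cg,
                                       unsigned long long now_ms,
                                       unsigned long long bytes)
{
    unsigned long long earliest_ms;

    cg->submitted += bytes;
    /* Earliest wall-clock time at which 'submitted' bytes fit under
     * the limit: submitted / limit_bps seconds, expressed in ms. */
    earliest_ms = cg->submitted * 1000 / cg->limit_bps;
    return earliest_ms > now_ms ? earliest_ms - now_ms : 0;
}
```

Because such a hook only needs a place to account and sleep, it can sit above any logical device (software RAID, network-attached storage), which is the portability argument made above; the flip side, as argued elsewhere in the thread, is that the underlying IO scheduler never sees the requests being held back.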
* Re: IO scheduler based IO Controller V2 2009-05-06 20:07 ` Andrea Righi @ 2009-05-06 21:21 ` Vivek Goyal 2009-05-06 21:21 ` Vivek Goyal 1 sibling, 0 replies; 97+ messages in thread From: Vivek Goyal @ 2009-05-06 21:21 UTC (permalink / raw) To: Andrea Righi Cc: dhaval-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, snitzer-H+wXaHxf7aLQT0dZR+AlfA, dm-devel-H+wXaHxf7aLQT0dZR+AlfA, jens.axboe-QHcLZuEGTsvQT0dZR+AlfA, agk-H+wXaHxf7aLQT0dZR+AlfA, balbir-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, paolo.valente-rcYM44yAMweonA0d6jMUrA, fernando-gVGce1chcLdL9jVzuh4AOg, jmoyer-H+wXaHxf7aLQT0dZR+AlfA, fchecconi-Re5JQEeQqe8AvxtiuMwx3w, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Andrew Morton On Wed, May 06, 2009 at 10:07:53PM +0200, Andrea Righi wrote: > On Tue, May 05, 2009 at 10:33:32PM -0400, Vivek Goyal wrote: > > On Tue, May 05, 2009 at 01:24:41PM -0700, Andrew Morton wrote: > > > On Tue, 5 May 2009 15:58:27 -0400 > > > Vivek Goyal <vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote: > > > > > > > > > > > Hi All, > > > > > > > > Here is the V2 of the IO controller patches generated on top of 2.6.30-rc4. > > > > ... > > > > Currently primarily two other IO controller proposals are out there. > > > > > > > > dm-ioband > > > > --------- > > > > This patch set is from Ryo Tsuruta from valinux. > > > > ... > > > > IO-throttling > > > > ------------- > > > > This patch set is from Andrea Righi provides max bandwidth controller. > > > > > > I'm thinking we need to lock you guys in a room and come back in 15 minutes. > > > > > > Seriously, how are we to resolve this? We could lock me in a room and > > > cmoe back in 15 days, but there's no reason to believe that I'd emerge > > > with the best answer. > > > > > > I tend to think that a cgroup-based controller is the way to go. > > > Anything else will need to be wired up to cgroups _anyway_, and that > > > might end up messy. 
> > > > Hi Andrew, > > > > Sorry, did not get what do you mean by cgroup based controller? If you > > mean that we use cgroups for grouping tasks for controlling IO, then both > > IO scheduler based controller as well as io throttling proposal do that. > > dm-ioband also supports that up to some extent but it requires extra step of > > transferring cgroup grouping information to dm-ioband device using dm-tools. > > > > But if you meant that io-throttle patches, then I think it solves only > > part of the problem and that is max bw control. It does not offer minimum > > BW/minimum disk share guarantees as offered by proportional BW control. > > > > IOW, it supports upper limit control and does not support a work conserving > > IO controller which lets a group use the whole BW if competing groups are > > not present. IMHO, proportional BW control is an important feature which > > we will need and IIUC, io-throttle patches can't be easily extended to support > > proportional BW control. OTOH, one should be able to extend IO scheduler > > based proportional weight controller to also support max bw control. > > Well, IMHO the big concern is at which level we want to implement the > logic of control: IO scheduler, when the IO requests are already > submitted and need to be dispatched, or at high level when the > applications generate IO requests (or maybe both). > > And, as pointed by Andrew, do everything by a cgroup-based controller. I am not sure what's the rationale behind that. Why do it at a higher layer? Doing it at the IO scheduler layer will make sure that one does not break the IO scheduler's properties with-in a cgroup. (See my other mail with some io-throttling test results.) The advantage of a higher layer mechanism is that it can also cover software RAID devices well. > > The other features, proportional BW, throttling, take the current ioprio > model in account, etc. 
are implementation details and any of the proposed solutions can be extended to support all these features. I mean, io-throttle can be extended to support proportional BW (from a certain perspective it is already provided by the throttling water mark in v16), as well as the IO scheduler based controller can be extended to support absolute BW limits. The same for dm-ioband. I don't think there are huge obstacles to merging the functionalities in this sense. Yes, from a technical point of view, one can implement a proportional BW controller at a higher layer also. But that would practically mean almost re-implementing the CFQ logic at a higher layer. Now why get into all that complexity? Why not simply make CFQ hierarchical to also handle the groups? Secondly, think of the following odd scenarios if we implement a higher level proportional BW controller which can offer the same features as CFQ and also can handle group scheduling.

Case1:
======
(Higher level proportional BW controller)
/dev/sda (CFQ)

So if somebody wants group scheduling, we will be doing the same IO control at two places (with-in a group). Once at the higher level and a second time at the CFQ level. Does not sound too logical to me.

Case2:
======
(Higher level proportional BW controller)
/dev/sda (NOOP)

This is the other extreme. The lower level IO scheduler does not offer any kind of notion of class or prio with-in a class, and the higher level scheduler will still be maintaining all the infrastructure unnecessarily. That's why I get back to this simple question again: why not extend the IO schedulers to handle group scheduling and do both proportional BW and max bw control there? > > > > > Andrea, last time you were planning to have a look at my patches and see > > if max bw controller can be implemented there. I got a feeling that it > > should not be too difficult to implement it there. 
We already have the > > hierarchical tree of io queues and groups in elevator layer and we run > > BFQ (WF2Q+) algorithm to select next queue to dispatch the IO from. It is > > just a matter of also keeping track of IO rate per queue/group and we should > > easily be able to delay the dispatch of IO from a queue if its group has > > crossed the specified max bw. > > Yes, sorry for my late reply. I quickly tested your patchset, but I still need > to understand many details of your solution. In the next days I'll > re-read everything carefully and I'll try to do a detailed review of > your patchset (just re-building the kernel with your patchset applied). > Sure. My patchset is still in the infancy stage. So don't expect great results. But it does highlight the idea and design very well. > > > > This should lead to less code and reduced complexity (compared with the > > case where we do max bw control with io-throttling patches and proportional > > BW control using IO scheduler based control patches). > > mmmh... changing the logic at the elevator and all IO schedulers doesn't > sound like reduced complexity and less code changed. With io-throttle we > just need to place the cgroup_io_throttle() hook in the right functions > where we want to apply throttling. This is a quite easy approach to > extend the IO control also to logical devices (more in general devices > that use their own make_request_fn) or even network-attached devices, as > well as networking filesystems, etc. > > But I may be wrong. As I said, I still need to review your > solution in detail. Well, I meant reduced code in the sense that we implement both max bw and proportional bw at the IO scheduler level instead of proportional BW at the IO scheduler and max bw at a higher level. I agree that doing max bw control at a higher level has the advantage that it covers all kinds of devices (higher level logical devices), which the IO scheduler level solution does not. 
But this comes at the price of broken IO scheduler properties with-in a cgroup. Maybe we can then implement both: a higher level max bw controller, and a max bw feature implemented alongside the proportional BW controller at the IO scheduler level. Folks who use hardware RAID or single disk devices can use the max bw control of the IO scheduler, and those using software RAID devices can use the higher level max bw controller. > > > > > So do you think that it would make sense to do max BW control along with > > proportional weight IO controller at IO scheduler? If yes, then we can > > work together and continue to develop this patchset to also support max > > bw control and meet your requirements and drop the io-throttling patches. > > It is surely worth exploring. Honestly, I don't know if it would be > a better solution or not. Probably comparing some results with different > IO workloads is the best way to proceed and decide which is the right > way to go. This is necessary IMHO, before totally dropping one solution > or another. Sure. My patches have started giving some basic results, but there is a lot of work remaining before a fair comparison can be done on the basis of performance under various workloads. So some more time to go before we can do a fair comparison based on numbers. > > > > > The only thing which concerns me is the fact that IO scheduler does not > > have the view of higher level logical device. So if somebody has setup a > > software RAID and wants to put max BW limit on software raid device, this > > solution will not work. One shall have to live with max bw limits on > > individual disks (where io scheduler is actually running). Do your patches > > allow putting a limit on software RAID devices also? > > No, but as I said above, my patchset provides the interfaces to apply the > IO control and accounting wherever we want. At the moment there's just > one interface, cgroup_io_throttle(). Sorry, I did not get it clearly. 
I guess I did not ask the question right. So let's say I got a setup where there are two physical devices /dev/sda and /dev/sdb, and I create a logical device (say using device mapper facilities) on top of these two physical disks. And some application is generating the IO for logical device lv0.

        Appl
         |
        lv0
        /  \
      sda  sdb

Where should I put the bandwidth limiting rules now for io-throttle? Do I specify these for the lv0 device or for the sda and sdb devices? Thanks Vivek ^ permalink raw reply [flat|nested] 97+ messages in thread
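The "make CFQ hierarchical" direction argued for in the message above can be sketched as a two-level pick: choose a group by weighted virtual time (the WF2Q+ flavour mentioned earlier in the thread, heavily simplified), then choose a queue inside that group by ioprio, so class/priority semantics survive within each group. All structures and names below are illustrative assumptions, not the actual CFQ or BFQ code:

```c
#include <assert.h>
#include <stddef.h>

struct cfq_queue {
    int ioprio;               /* lower value = higher priority */
    int pending;              /* requests waiting in this queue */
};

struct cfq_group {
    unsigned int weight;      /* proportional share of the disk */
    unsigned long long vtime; /* virtual time consumed so far */
    struct cfq_queue *queues;
    int nr_queues;
};

/* Level 2: inside a group, honour ioprio just as plain CFQ would. */
static struct cfq_queue *pick_queue(struct cfq_group *grp)
{
    struct cfq_queue *best = NULL;
    for (int i = 0; i < grp->nr_queues; i++) {
        struct cfq_queue *q = &grp->queues[i];
        if (q->pending && (!best || q->ioprio < best->ioprio))
            best = q;
    }
    return best;
}

/* Level 1: across groups, pick the backlogged group with the smallest
 * virtual time, charge it service/weight (so high-weight groups are
 * selected proportionally more often), then descend into it. */
static struct cfq_queue *pick_next_request(struct cfq_group *grps, int n,
                                           unsigned long long service)
{
    struct cfq_group *best = NULL;
    for (int i = 0; i < n; i++) {
        if (!pick_queue(&grps[i]))
            continue;         /* skip groups with no pending IO */
        if (!best || grps[i].vtime < best->vtime)
            best = &grps[i];
    }
    if (!best)
        return NULL;
    best->vtime += service / best->weight;
    return pick_queue(best);
}
```

This also illustrates the Case1/Case2 point: once the group level is folded into the scheduler itself, there is neither a second stack of queues above CFQ nor unused class/prio machinery sitting above noop.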
* Re: IO scheduler based IO Controller V2 2009-05-06 21:21 ` Vivek Goyal @ 2009-05-06 22:02 ` Andrea Righi 0 siblings, 0 replies; 97+ messages in thread From: Andrea Righi @ 2009-05-06 22:02 UTC (permalink / raw) To: Vivek Goyal Cc: dhaval-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, snitzer-H+wXaHxf7aLQT0dZR+AlfA, dm-devel-H+wXaHxf7aLQT0dZR+AlfA, jens.axboe-QHcLZuEGTsvQT0dZR+AlfA, agk-H+wXaHxf7aLQT0dZR+AlfA, balbir-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, paolo.valente-rcYM44yAMweonA0d6jMUrA, fernando-gVGce1chcLdL9jVzuh4AOg, jmoyer-H+wXaHxf7aLQT0dZR+AlfA, fchecconi-Re5JQEeQqe8AvxtiuMwx3w, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Andrew Morton On Wed, May 06, 2009 at 05:21:21PM -0400, Vivek Goyal wrote: > > Well, IMHO the big concern is at which level we want to implement the > > logic of control: IO scheduler, when the IO requests are already > > submitted and need to be dispatched, or at high level when the > > applications generates IO requests (or maybe both). > > > > And, as pointed by Andrew, do everything by a cgroup-based controller. > > I am not sure what's the rationale behind that. Why to do it at higher > layer? Doing it at IO scheduler layer will make sure that one does not > breaks the IO scheduler's properties with-in cgroup. (See my other mail > with some io-throttling test results). > > The advantage of higher layer mechanism is that it can also cover software > RAID devices well. > > > > > The other features, proportional BW, throttling, take the current ioprio > > model in account, etc. are implementation details and any of the > > proposed solutions can be extended to support all these features. I > > mean, io-throttle can be extended to support proportional BW (for a > > certain perspective it is already provided by the throttling water mark > > in v16), as well as the IO scheduler based controller can be extended to > > support absolute BW limits. The same for dm-ioband. 
I don't think > > there are huge obstacles to merging the functionalities in this sense. > > Yes, from a technical point of view, one can implement a proportional BW > controller at a higher layer also. But that would practically mean almost > re-implementing the CFQ logic at the higher layer. Now why get into all > that complexity? Why not simply make CFQ hierarchical to also handle the > groups? Making CFQ aware of cgroups is very important too. I could be wrong, but I don't think we would need to re-implement the same exact CFQ logic at higher layers. CFQ dispatches IO requests; at higher layers applications submit IO requests. We're talking about different things, and applying different logic doesn't sound too strange IMHO. I mean, at least we should consider/test this different approach as well before deciding to drop it. This solution also guarantees no changes in the IO schedulers for those who are not interested in using the cgroup IO controller. What is the impact of the IO scheduler based controller for those users? > > Secondly, think of the following odd scenarios if we implement a higher level > proportional BW controller which can offer the same features as CFQ and > can also handle group scheduling. > > Case1: > ====== > (Higher level proportional BW controller) > /dev/sda (CFQ) > > So if somebody wants group scheduling, we will be doing the same IO control > in two places (within a group). Once at the higher level and a second time at CFQ > level. Does not sound too logical to me. > > Case2: > ====== > > (Higher level proportional BW controller) > /dev/sda (NOOP) > > This is the other extreme. The lower level IO scheduler does not offer any kind > of notion of class, or prio within a class, and the higher level scheduler will > still be maintaining all the infrastructure unnecessarily. > > That's why I get back to this simple question again: why not extend the > IO schedulers to handle group scheduling and do both proportional BW and > max bw control there.
> > > > > > > > > Andrea, last time you were planning to have a look at my patches and see > > > if max bw controller can be implemented there. I got a feeling that it > > > should not be too difficult to implement it there. We already have the > > > hierarchical tree of io queues and groups in the elevator layer and we run the > > > BFQ (WF2Q+) algorithm to select the next queue to dispatch the IO from. It is > > > just a matter of also keeping track of the IO rate per queue/group and we should > > > easily be able to delay the dispatch of IO from a queue if its group has > > > crossed the specified max bw. > > > > Yes, sorry for my late reply; I quickly tested your patchset, but I still need > > to understand many details of your solution. In the next few days I'll > > re-read everything carefully and I'll try to do a detailed review of > > your patchset (just re-building the kernel with your patchset applied). > > > > Sure. My patchset is still in its infancy, so don't expect great > results. But it does highlight the idea and design very well. > > > > > > > This should lead to less code and reduced complexity (compared with the > > > case where we do max bw control with the io-throttling patches and proportional > > > BW control using the IO scheduler based control patches). > > > > mmmh... changing the logic at the elevator and all IO schedulers doesn't > > sound like reduced complexity and less code changed. With io-throttle we > > just need to place the cgroup_io_throttle() hook in the right functions > > where we want to apply throttling. This is quite an easy approach to > > extend the IO control also to logical devices (more generally, devices > > that use their own make_request_fn) or even network-attached devices, as > > well as networking filesystems, etc. > > > > But I may be wrong. As I said, I still need to review your > > solution in detail.
> > Well, I meant reduced code in the sense that we implement both max bw and > proportional bw at the IO scheduler level, instead of proportional BW at the > IO scheduler and max bw at a higher level. OK. > > I agree that doing max bw control at a higher level has the advantage that > it covers all kinds of devices (higher level logical devices), and the IO > scheduler level solution does not do that. But this comes at the price > of broken IO scheduler properties within a cgroup. > > Maybe we can then implement both. A higher level max bw controller, and a > max bw feature implemented alongside the proportional BW controller at the IO > scheduler level. Folks who use hardware RAID or single disk devices can > use the max bw control of the IO scheduler, and those using software RAID devices > can use the higher level max bw controller. OK, maybe. > > > > > > > > > So do you think that it would make sense to do max BW control along with > > > proportional weight IO controller at IO scheduler? If yes, then we can > > > work together and continue to develop this patchset to also support max > > > bw control and meet your requirements and drop the io-throttling patches. > > > > It is surely worth exploring. Honestly, I don't know if it would be > > a better solution or not. Probably comparing some results with different > > IO workloads is the best way to proceed and decide which is the right > > way to go. This is necessary IMHO, before totally dropping one solution > > or another. > > Sure. My patches have started giving some basic results, but there > is a lot of work remaining before a fair comparison can be done on the > basis of performance under various workloads. So some more time to > go before we can do a fair comparison based on numbers. > > > > > > > > > The only thing which concerns me is the fact that the IO scheduler does not > > > have the view of the higher level logical device.
So if somebody has set up a > > > software RAID and wants to put a max BW limit on the software RAID device, this > > > solution will not work. One shall have to live with max bw limits on > > > individual disks (where the io scheduler is actually running). Do your patches > > > allow putting limits on software RAID devices also? > > > > No, but as said above my patchset provides the interfaces to apply the > IO control and accounting wherever we want. At the moment there's just > > one interface, cgroup_io_throttle(). > > Sorry, I did not get it clearly. I guess I did not ask the question right. > So let's say I have a setup where there are two physical devices /dev/sda and > /dev/sdb, and I create a logical device (say using device mapper facilities) > on top of these two physical disks. And some application is generating > the IO for logical device lv0.
>
>    Appl
>     |
>    lv0
>    / \
>  sda   sdb
>
> Where should I put the bandwidth limiting rules now for io-throttle? Should I > specify these for the lv0 device, or for the sda and sdb devices? The BW limiting rules would be applied in the make_request_fn provided by the lv0 device. If it's not provided, before calling generic_make_request(). A problem could be that the driver must be aware of the particular lv0 device at that point. > > Thanks > Vivek OK. I definitely need to look at your patchset before giving any further opinion... :) Thanks, -Andrea ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: IO scheduler based IO Controller V2 2009-05-06 22:02 ` Andrea Righi @ 2009-05-06 22:17 ` Vivek Goyal -1 siblings, 0 replies; 97+ messages in thread From: Vivek Goyal @ 2009-05-06 22:17 UTC (permalink / raw) To: Andrea Righi Cc: dhaval-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, snitzer-H+wXaHxf7aLQT0dZR+AlfA, dm-devel-H+wXaHxf7aLQT0dZR+AlfA, jens.axboe-QHcLZuEGTsvQT0dZR+AlfA, agk-H+wXaHxf7aLQT0dZR+AlfA, balbir-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, paolo.valente-rcYM44yAMweonA0d6jMUrA, fernando-gVGce1chcLdL9jVzuh4AOg, jmoyer-H+wXaHxf7aLQT0dZR+AlfA, fchecconi-Re5JQEeQqe8AvxtiuMwx3w, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Andrew Morton On Thu, May 07, 2009 at 12:02:51AM +0200, Andrea Righi wrote: > On Wed, May 06, 2009 at 05:21:21PM -0400, Vivek Goyal wrote: > > > Well, IMHO the big concern is at which level we want to implement the > > > logic of control: IO scheduler, when the IO requests are already > > > submitted and need to be dispatched, or at high level when the > > > applications generates IO requests (or maybe both). > > > > > > And, as pointed by Andrew, do everything by a cgroup-based controller. > > > > I am not sure what's the rationale behind that. Why to do it at higher > > layer? Doing it at IO scheduler layer will make sure that one does not > > breaks the IO scheduler's properties with-in cgroup. (See my other mail > > with some io-throttling test results). > > > > The advantage of higher layer mechanism is that it can also cover software > > RAID devices well. > > > > > > > > The other features, proportional BW, throttling, take the current ioprio > > > model in account, etc. are implementation details and any of the > > > proposed solutions can be extended to support all these features. 
I > > > mean, io-throttle can be extended to support proportional BW (from a > > > certain perspective it is already provided by the throttling water mark > > > in v16), as well as the IO scheduler based controller can be extended to > > > support absolute BW limits. The same for dm-ioband. I don't think > > > there are huge obstacles to merging the functionalities in this sense. > > > > Yes, from a technical point of view, one can implement a proportional BW > > controller at a higher layer also. But that would practically mean almost > > re-implementing the CFQ logic at the higher layer. Now why get into all > > that complexity? Why not simply make CFQ hierarchical to also handle the > > groups? > > Making CFQ aware of cgroups is very important too. I could be wrong, but > I don't think we would need to re-implement the same exact CFQ logic at > higher layers. CFQ dispatches IO requests; at higher layers applications > submit IO requests. We're talking about different things, and applying > different logic doesn't sound too strange IMHO. I mean, at least we > should consider/test this different approach as well before deciding to drop > it. > A lot of the CFQ code is about maintaining per-io-context queues, for different classes and different prio levels, about anticipation for reads, etc. Anybody who wants to get classes and ioprio within a cgroup right will end up duplicating all that logic (to cover all the cases). So I did not mean that you will end up copying the whole code, but logically a lot of it. Secondly, there will be a mismatch in the anticipation logic. CFQ gives preference to reads, and for dependent readers it idles and waits for the next request to come. Higher level throttling can interfere with the IO pattern of an application and can lead CFQ to think that the average thinktime of this application is high and to disable anticipation on that application, which should result in high latencies for simple commands like "ls" in the presence of competing applications.
> This solution also guarantees no changes in the IO schedulers for those > who are not interested in using the cgroup IO controller. What is the > impact of the IO scheduler based controller for those users? > The IO scheduler based solution is highly customizable. First of all, there are compile time switches to either completely remove the fair queuing code (for noop, deadline and AS only) or to disable group scheduling only. If that's the case, one would expect the same behavior as the old scheduler. Secondly, even if everything is compiled in and the customer is not using cgroups, I would expect almost the same behavior (because we will have only the root group). There will be extra code in the way, and we will need some optimizations to detect that there is only one group and bypass as much code as possible, bringing the overhead of the new code to a minimum. So if the customer is not using the IO controller, he should get the same behavior as the old system. Can't prove it right now because my patches are not that mature yet, but there are no fundamental design limitations. Thanks Vivek ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: IO scheduler based IO Controller V2 [not found] ` <20090506023332.GA1212-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2009-05-06 17:59 ` Nauman Rafique 2009-05-06 20:07 ` Andrea Righi ` (2 subsequent siblings) 3 siblings, 0 replies; 97+ messages in thread From: Nauman Rafique @ 2009-05-06 17:59 UTC (permalink / raw) To: Vivek Goyal Cc: dhaval-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, snitzer-H+wXaHxf7aLQT0dZR+AlfA, dm-devel-H+wXaHxf7aLQT0dZR+AlfA, jens.axboe-QHcLZuEGTsvQT0dZR+AlfA, agk-H+wXaHxf7aLQT0dZR+AlfA, balbir-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, paolo.valente-rcYM44yAMweonA0d6jMUrA, fernando-gVGce1chcLdL9jVzuh4AOg, jmoyer-H+wXaHxf7aLQT0dZR+AlfA, fchecconi-Re5JQEeQqe8AvxtiuMwx3w, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Andrew Morton, righi.andrea-Re5JQEeQqe8AvxtiuMwx3w On Tue, May 5, 2009 at 7:33 PM, Vivek Goyal <vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote: > On Tue, May 05, 2009 at 01:24:41PM -0700, Andrew Morton wrote: >> On Tue, 5 May 2009 15:58:27 -0400 >> Vivek Goyal <vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote: >> >> > >> > Hi All, >> > >> > Here is the V2 of the IO controller patches generated on top of 2.6.30-rc4. >> > ... >> > Currently primarily two other IO controller proposals are out there. >> > >> > dm-ioband >> > --------- >> > This patch set is from Ryo Tsuruta from valinux. >> > ... >> > IO-throttling >> > ------------- >> > This patch set is from Andrea Righi provides max bandwidth controller. >> >> I'm thinking we need to lock you guys in a room and come back in 15 minutes. >> >> Seriously, how are we to resolve this? We could lock me in a room and >> cmoe back in 15 days, but there's no reason to believe that I'd emerge >> with the best answer. >> >> I tend to think that a cgroup-based controller is the way to go. >> Anything else will need to be wired up to cgroups _anyway_, and that >> might end up messy. 
> > Hi Andrew, > > Sorry, I did not get what you mean by a cgroup based controller. If you > mean that we use cgroups for grouping tasks for controlling IO, then both > the IO scheduler based controller as well as the io throttling proposal do that. > dm-ioband also supports that up to some extent, but it requires the extra step of > transferring the cgroup grouping information to the dm-ioband device using dm-tools. > > But if you meant the io-throttle patches, then I think they solve only > part of the problem, and that is max bw control. They do not offer the minimum > BW/minimum disk share guarantees offered by proportional BW control. > > IOW, it supports upper limit control and does not support a work conserving > IO controller which lets a group use the whole BW if competing groups are > not present. IMHO, proportional BW control is an important feature which > we will need, and IIUC, the io-throttle patches can't be easily extended to support > proportional BW control; OTOH, one should be able to extend the IO scheduler > based proportional weight controller to also support max bw control. > > Andrea, last time you were planning to have a look at my patches and see > if a max bw controller can be implemented there. I got a feeling that it > should not be too difficult to implement it there. We already have the > hierarchical tree of io queues and groups in the elevator layer and we run the > BFQ (WF2Q+) algorithm to select the next queue to dispatch the IO from. It is > just a matter of also keeping track of the IO rate per queue/group and we should > easily be able to delay the dispatch of IO from a queue if its group has > crossed the specified max bw. > > This should lead to less code and reduced complexity (compared with the > case where we do max bw control with the io-throttling patches and proportional > BW control using the IO scheduler based control patches). > > So do you think that it would make sense to do max BW control along with > proportional weight IO controller at IO scheduler?
If yes, then we can > work together and continue to develop this patchset to also support max > bw control and meet your requirements and drop the io-throttling patches. > > The only thing which concerns me is the fact that the IO scheduler does not > have the view of the higher level logical device. So if somebody has set up a > software RAID and wants to put a max BW limit on the software RAID device, this > solution will not work. One shall have to live with max bw limits on > individual disks (where the io scheduler is actually running). Do your patches > allow putting limits on software RAID devices also? > > Ryo, dm-ioband breaks the notion of classes and priority of CFQ because > of the FIFO dispatch of buffered bios. Apart from that, it tries to provide > fairness in terms of actual IO done, and that would mean a seeky workload > can use the disk for much longer to get equivalent IO done and slow down > other applications. Implementing the IO controller at the IO scheduler level gives > us tighter control. Will it not meet your requirements? If you have specific > concerns with the IO scheduler based control patches, please highlight these and > we will see how they can be addressed. In my opinion, IO throttling and dm-ioband are probably simpler, but incomplete solutions to the problem. And for a solution to be complete, it would have to be at the IO scheduler layer so it can do things like take an IO as soon as it comes and stick it at the front of all the queues so that it can go to the disk right away. This patch set is big, but it takes us in the right direction. Our ultimate goal should be to reach the level of control that we have over CPU and network resources. And I don't think the IO throttling and dm-ioband approaches take us in that direction. > > Thanks > Vivek > ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: IO scheduler based IO Controller V2 [not found] ` <20090506023332.GA1212-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2009-05-06 17:59 ` Nauman Rafique @ 2009-05-06 20:07 ` Andrea Righi 2009-05-06 20:32 ` Vivek Goyal 2009-05-07 0:18 ` Ryo Tsuruta 3 siblings, 0 replies; 97+ messages in thread From: Andrea Righi @ 2009-05-06 20:07 UTC (permalink / raw) To: Vivek Goyal Cc: dhaval-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, snitzer-H+wXaHxf7aLQT0dZR+AlfA, dm-devel-H+wXaHxf7aLQT0dZR+AlfA, jens.axboe-QHcLZuEGTsvQT0dZR+AlfA, agk-H+wXaHxf7aLQT0dZR+AlfA, balbir-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, paolo.valente-rcYM44yAMweonA0d6jMUrA, fernando-gVGce1chcLdL9jVzuh4AOg, jmoyer-H+wXaHxf7aLQT0dZR+AlfA, fchecconi-Re5JQEeQqe8AvxtiuMwx3w, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Andrew Morton On Tue, May 05, 2009 at 10:33:32PM -0400, Vivek Goyal wrote: > On Tue, May 05, 2009 at 01:24:41PM -0700, Andrew Morton wrote: > > On Tue, 5 May 2009 15:58:27 -0400 > > Vivek Goyal <vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote: > > > > > > > > Hi All, > > > > > > Here is the V2 of the IO controller patches generated on top of 2.6.30-rc4. > > > ... > > > Currently primarily two other IO controller proposals are out there. > > > > > > dm-ioband > > > --------- > > > This patch set is from Ryo Tsuruta from valinux. > > > ... > > > IO-throttling > > > ------------- > > > This patch set is from Andrea Righi provides max bandwidth controller. > > > > I'm thinking we need to lock you guys in a room and come back in 15 minutes. > > > > Seriously, how are we to resolve this? We could lock me in a room and > > cmoe back in 15 days, but there's no reason to believe that I'd emerge > > with the best answer. > > > > I tend to think that a cgroup-based controller is the way to go. > > Anything else will need to be wired up to cgroups _anyway_, and that > > might end up messy. > > Hi Andrew, > > Sorry, did not get what do you mean by cgroup based controller? 
> If you
> mean that we use cgroups for grouping tasks for controlling IO, then both
> IO scheduler based controller as well as io throttling proposal do that.
> dm-ioband also supports that up to some extent but it requires extra step of
> transferring cgroup grouping information to dm-ioband device using dm-tools.
>
> But if you meant the io-throttle patches, then I think it solves only
> part of the problem and that is max bw control. It does not offer minimum
> BW/minimum disk share guarantees as offered by proportional BW control.
>
> IOW, it supports upper limit control and does not support a work conserving
> IO controller which lets a group use the whole BW if competing groups are
> not present. IMHO, proportional BW control is an important feature which
> we will need and IIUC, io-throttle patches can't be easily extended to support
> proportional BW control. OTOH, one should be able to extend IO scheduler
> based proportional weight controller to also support max bw control.

Well, IMHO the big concern is at which level we want to implement the logic of control: at the IO scheduler, when the IO requests are already submitted and need to be dispatched, or at a higher level, when the applications generate the IO requests (or maybe both). And, as pointed out by Andrew, do everything via a cgroup-based controller.

The other features (proportional BW, throttling, taking the current ioprio model into account, etc.) are implementation details, and any of the proposed solutions can be extended to support all of them. I mean, io-throttle can be extended to support proportional BW (from a certain perspective it is already provided by the throttling water mark in v16), just as the IO scheduler based controller can be extended to support absolute BW limits. The same goes for dm-ioband. I don't think there are huge obstacles to merging the functionalities in this sense.
> I got a feeling that it
> should not be too difficult to implement it there. We already have the
> hierarchical tree of io queues and groups in elevator layer and we run
> BFQ (WF2Q+) algorithm to select next queue to dispatch the IO from. It is
> just a matter of also keeping track of IO rate per queue/group and we should
> easily be able to delay the dispatch of IO from a queue if its group has
> crossed the specified max bw.

Yes, sorry for the delay. I quickly tested your patchset, but I still need to understand many details of your solution. In the next few days I'll re-read everything carefully and try to do a detailed review of your patchset (I'm just re-building the kernel with your patchset applied).

>
> This should lead to less code and reduced complexity (compared with the
> case where we do max bw control with io-throttling patches and proportional
> BW control using IO scheduler based control patches).

mmmh... changing the logic at the elevator and in all the IO schedulers doesn't sound like reduced complexity and less code changed. With io-throttle we just need to place the cgroup_io_throttle() hook in the right functions where we want to apply throttling. This makes it quite easy to extend the IO control also to logical devices (more generally, devices that use their own make_request_fn), or even network-attached devices, as well as networking filesystems, etc. But I may be wrong. As I said, I still need to review your solution in detail.

>
> So do you think that it would make sense to do max BW control along with
> proportional weight IO controller at IO scheduler? If yes, then we can
> work together and continue to develop this patchset to also support max
> bw control and meet your requirements and drop the io-throttling patches.

It is surely worth exploring. Honestly, I don't know if it would be a better solution or not.
Probably comparing some results with different IO workloads is the best way to proceed and decide which is the right way to go. This is necessary IMHO, before totally dropping one solution or another. > > The only thing which concerns me is the fact that IO scheduler does not > have the view of higher level logical device. So if somebody has setup a > software RAID and wants to put max BW limit on software raid device, this > solution will not work. One shall have to live with max bw limits on > individual disks (where io scheduler is actually running). Do your patches > allow to put limit on software RAID devices also? No, but as said above my patchset provides the interfaces to apply the IO control and accounting wherever we want. At the moment there's just one interface, cgroup_io_throttle(). -Andrea ^ permalink raw reply [flat|nested] 97+ messages in thread
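The hook-based approach Andrea describes above (a single cgroup_io_throttle() call placed wherever throttling should apply, which is why it also covers devices with their own make_request_fn) might look roughly like this. Only the hook name comes from the mail; the per-cgroup state and the surrounding code are an invented stand-in, not the io-throttle implementation:

```c
#include <assert.h>
#include <stdint.h>

/* Invented stand-in for the per-cgroup state io-throttle would look up. */
struct iot_cgroup {
    uint64_t budget;   /* bytes this cgroup may still submit in the window */
};

/* The single hook: returns 0 if the IO may proceed, or a sleep time in
 * ms if the submitting task must be throttled. */
unsigned int cgroup_io_throttle(struct iot_cgroup *cg, uint64_t bytes)
{
    if (bytes <= cg->budget) {
        cg->budget -= bytes;
        return 0;
    }
    return 10;   /* over budget: caller sleeps and retries */
}

/* Because the hook sits above the IO scheduler, any make_request_fn --
 * a plain disk, a software RAID device, a network-attached device --
 * can call it before passing the bio down. */
unsigned int submit_bio_sketch(struct iot_cgroup *cg, uint64_t bytes)
{
    return cgroup_io_throttle(cg, bytes);
}
```

The flip side, as Vivek's results below show, is that a hook at this level sees only byte counts, not IO classes or priorities.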
* Re: IO scheduler based IO Controller V2 [not found] ` <20090506023332.GA1212-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2009-05-06 17:59 ` Nauman Rafique 2009-05-06 20:07 ` Andrea Righi @ 2009-05-06 20:32 ` Vivek Goyal 2009-05-07 0:18 ` Ryo Tsuruta 3 siblings, 0 replies; 97+ messages in thread From: Vivek Goyal @ 2009-05-06 20:32 UTC (permalink / raw) To: Andrew Morton, Andrea Righi Cc: dhaval-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, snitzer-H+wXaHxf7aLQT0dZR+AlfA, dm-devel-H+wXaHxf7aLQT0dZR+AlfA, jens.axboe-QHcLZuEGTsvQT0dZR+AlfA, agk-H+wXaHxf7aLQT0dZR+AlfA, balbir-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, paolo.valente-rcYM44yAMweonA0d6jMUrA, fernando-gVGce1chcLdL9jVzuh4AOg, jmoyer-H+wXaHxf7aLQT0dZR+AlfA, fchecconi-Re5JQEeQqe8AvxtiuMwx3w, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, linux-kernel-u79uwXL29TY76Z2rM5mHXA On Tue, May 05, 2009 at 10:33:32PM -0400, Vivek Goyal wrote: > On Tue, May 05, 2009 at 01:24:41PM -0700, Andrew Morton wrote: > > On Tue, 5 May 2009 15:58:27 -0400 > > Vivek Goyal <vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote: > > > > > > > > Hi All, > > > > > > Here is the V2 of the IO controller patches generated on top of 2.6.30-rc4. > > > ... > > > Currently primarily two other IO controller proposals are out there. > > > > > > dm-ioband > > > --------- > > > This patch set is from Ryo Tsuruta from valinux. > > > ... > > > IO-throttling > > > ------------- > > > This patch set is from Andrea Righi provides max bandwidth controller. > > > > I'm thinking we need to lock you guys in a room and come back in 15 minutes. > > > > Seriously, how are we to resolve this? We could lock me in a room and > > cmoe back in 15 days, but there's no reason to believe that I'd emerge > > with the best answer. > > > > I tend to think that a cgroup-based controller is the way to go. > > Anything else will need to be wired up to cgroups _anyway_, and that > > might end up messy. > > Hi Andrew, > > Sorry, did not get what do you mean by cgroup based controller? 
If you > mean that we use cgroups for grouping tasks for controlling IO, then both > IO scheduler based controller as well as io throttling proposal do that. > dm-ioband also supports that up to some extent but it requires extra step of > transferring cgroup grouping information to dm-ioband device using dm-tools. > > But if you meant that io-throttle patches, then I think it solves only > part of the problem and that is max bw control. It does not offer minimum > BW/minimum disk share gurantees as offered by proportional BW control. > > IOW, it supports upper limit control and does not support a work conserving > IO controller which lets a group use the whole BW if competing groups are > not present. IMHO, proportional BW control is an important feature which > we will need and IIUC, io-throttle patches can't be easily extended to support > proportional BW control, OTOH, one should be able to extend IO scheduler > based proportional weight controller to also support max bw control. > > Andrea, last time you were planning to have a look at my patches and see > if max bw controller can be implemented there. I got a feeling that it > should not be too difficult to implement it there. We already have the > hierarchical tree of io queues and groups in elevator layer and we run > BFQ (WF2Q+) algorithm to select next queue to dispatch the IO from. It is > just a matter of also keeping track of IO rate per queue/group and we should > be easily be able to delay the dispatch of IO from a queue if its group has > crossed the specified max bw. > > This should lead to less code and reduced complextiy (compared with the > case where we do max bw control with io-throttling patches and proportional > BW control using IO scheduler based control patches). > > So do you think that it would make sense to do max BW control along with > proportional weight IO controller at IO scheduler? 
> If yes, then we can
> work together and continue to develop this patchset to also support max
> bw control and meet your requirements and drop the io-throttling patches.
>

Hi Andrea and others,

I always had this doubt in mind that any kind of 2nd level controller will have no idea about the underlying IO scheduler's queues/semantics. So while it can implement a particular cgroup policy (max bw like io-throttle or proportional bw like dm-ioband), there are high chances that it will break the IO scheduler's semantics in one way or another.

I had already sent out the results for dm-ioband in a separate thread.

http://linux.derkeiler.com/Mailing-Lists/Kernel/2009-04/msg07258.html
http://linux.derkeiler.com/Mailing-Lists/Kernel/2009-04/msg07573.html
http://linux.derkeiler.com/Mailing-Lists/Kernel/2009-04/msg08177.html
http://linux.derkeiler.com/Mailing-Lists/Kernel/2009-04/msg08345.html
http://linux.derkeiler.com/Mailing-Lists/Kernel/2009-04/msg08355.html

Here are some basic results with io-throttle. Andrea, please let me know if you think this is a procedural problem; I am playing with the io-throttle patches for the first time.

I took V16 of your patches and am trying them out with 2.6.30-rc4 with the CFQ scheduler. I have got one SATA drive with one partition on it.

I am trying to create one cgroup, assign an 8MB/s limit to it, launch one RT prio 0 task and one BE prio 7 task, and see how this 8MB/s is divided between these tasks. Following are the results.

Following is my test script.

*******************************************************************
#!/bin/bash

mount /dev/sdb1 /mnt/sdb

mount -t cgroup -o blockio blockio /cgroup/iot/
mkdir -p /cgroup/iot/test1 /cgroup/iot/test2

# Set bw limit of 8 MB/ps on sdb
echo "/dev/sdb:$((8 * 1024 * 1024)):0:0" > /cgroup/iot/test1/blockio.bandwidth-max

sync
echo 3 > /proc/sys/vm/drop_caches

echo $$ > /cgroup/iot/test1/tasks

# Launch a normal prio reader.
ionice -c 2 -n 7 dd if=/mnt/sdb/zerofile1 of=/dev/zero &
pid1=$!
echo $pid1 # Launch an RT reader ionice -c 1 -n 0 dd if=/mnt/sdb/zerofile2 of=/dev/zero & pid2=$! echo $pid2 wait $pid2 echo "RT task finished" ********************************************************************** Test1 ===== Test two readers (one RT class and one BE class) and see how BW is allocated with-in cgroup With io-throttle patches ------------------------ - Two readers, first BE prio 7, second RT prio 0 234179072 bytes (234 MB) copied, 55.8482 s, 4.2 MB/s 234179072 bytes (234 MB) copied, 55.8975 s, 4.2 MB/s RT task finished Note: See, there is no difference in the performance of RT or BE task. Looks like these got throttled equally. Without io-throttle patches ---------------------------- - Two readers, first BE prio 7, second RT prio 0 234179072 bytes (234 MB) copied, 2.81801 s, 83.1 MB/s RT task finished 234179072 bytes (234 MB) copied, 5.28238 s, 44.3 MB/s Note: Because I can't limit the BW without io-throttle patches, so don't worry about increased BW. But the important point is that RT task gets much more BW than a BE prio 7 task. Test2 ==== - Test 2 readers (One BE prio 0 and one BE prio 7) and see how BW is distributed among these. With io-throttle patches ------------------------ - Two readers, first BE prio 7, second BE prio 0 234179072 bytes (234 MB) copied, 55.8604 s, 4.2 MB/s 234179072 bytes (234 MB) copied, 55.8918 s, 4.2 MB/s High prio reader finished Without io-throttle patches --------------------------- - Two readers, first BE prio 7, second BE prio 0 234179072 bytes (234 MB) copied, 4.12074 s, 56.8 MB/s High prio reader finished 234179072 bytes (234 MB) copied, 5.36023 s, 43.7 MB/s Note: There is no service differentiation between prio 0 and prio 7 task with io-throttle patches. Test 3 ====== - Run the one RT reader and one BE reader in root cgroup without any limitations. I guess this should mean unlimited BW and behavior should be same as with CFQ without io-throttling patches. 
With io-throttle patches
=========================
Ran the test 4 times because I was getting different results in different runs.

- Two readers, one RT prio 0, other BE prio 7

234179072 bytes (234 MB) copied, 2.74604 s, 85.3 MB/s
234179072 bytes (234 MB) copied, 5.20995 s, 44.9 MB/s
RT task finished

234179072 bytes (234 MB) copied, 4.54417 s, 51.5 MB/s
RT task finished
234179072 bytes (234 MB) copied, 5.23396 s, 44.7 MB/s

234179072 bytes (234 MB) copied, 5.17727 s, 45.2 MB/s
RT task finished
234179072 bytes (234 MB) copied, 5.25894 s, 44.5 MB/s

234179072 bytes (234 MB) copied, 2.74141 s, 85.4 MB/s
234179072 bytes (234 MB) copied, 5.20536 s, 45.0 MB/s
RT task finished

Note: Out of 4 runs, it looks like twice there is complete priority inversion and the RT task finished after the BE task. In the other two runs, the difference between the BW of the RT and BE tasks is much less as compared to without patches. In fact, once it was almost the same.

Without io-throttle patches
===========================
- Two readers, one RT prio 0, other BE prio 7 (4 runs)

234179072 bytes (234 MB) copied, 2.80988 s, 83.3 MB/s
RT task finished
234179072 bytes (234 MB) copied, 5.28228 s, 44.3 MB/s

234179072 bytes (234 MB) copied, 2.80659 s, 83.4 MB/s
RT task finished
234179072 bytes (234 MB) copied, 5.27874 s, 44.4 MB/s

234179072 bytes (234 MB) copied, 2.79601 s, 83.8 MB/s
RT task finished
234179072 bytes (234 MB) copied, 5.2542 s, 44.6 MB/s

234179072 bytes (234 MB) copied, 2.78764 s, 84.0 MB/s
RT task finished
234179072 bytes (234 MB) copied, 5.26009 s, 44.5 MB/s

Note how consistent the behavior is without io-throttle patches.

In summary, I think a 2nd level solution can enforce one policy on cgroups, but it will break other semantics/properties of the IO scheduler within the cgroup, as a 2nd level solution has no idea at run time which IO scheduler is running underneath and what kind of properties it has.

Andrea, please try it on your setup and see if you get similar results or not.
Hopefully it is not a configuration or test procedure issue on my side. Thanks Vivek > The only thing which concerns me is the fact that IO scheduler does not > have the view of higher level logical device. So if somebody has setup a > software RAID and wants to put max BW limit on software raid device, this > solution will not work. One shall have to live with max bw limits on > individual disks (where io scheduler is actually running). Do your patches > allow to put limit on software RAID devices also? > > Ryo, dm-ioband breaks the notion of classes and priority of CFQ because > of FIFO dispatch of buffered bios. Apart from that it tries to provide > fairness in terms of actual IO done and that would mean a seeky workload > will can use disk for much longer to get equivalent IO done and slow down > other applications. Implementing IO controller at IO scheduler level gives > us tigher control. Will it not meet your requirements? If you got specific > concerns with IO scheduler based contol patches, please highlight these and > we will see how these can be addressed. > > Thanks > Vivek ^ permalink raw reply [flat|nested] 97+ messages in thread
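As a quick sanity check of the Test1 numbers in the message above: if io-throttle splits the 8 MB/s cgroup limit evenly regardless of IO class, each 234 MB reader gets roughly 4 MB/s and should finish in about 58 s, which lines up with the ~55.8 s runs reported (dd measured slightly over 4 MB/s). A throwaway back-of-envelope calculation, not part of the test script:

```c
#include <assert.h>

/* Expected completion time if the cgroup limit is shared evenly. */
double expected_seconds(double limit_mb_per_s, int readers, double file_mb)
{
    double per_reader = limit_mb_per_s / readers;  /* 4 MB/s each in Test1 */
    return file_mb / per_reader;                   /* ~58.5 s vs ~55.8 s seen */
}
```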
* Re: IO scheduler based IO Controller V2 [not found] ` <20090506023332.GA1212-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> ` (2 preceding siblings ...) 2009-05-07 0:18 ` Ryo Tsuruta 3 siblings, 0 replies; 97+ messages in thread From: Ryo Tsuruta @ 2009-05-07 0:18 UTC (permalink / raw) To: vgoyal-H+wXaHxf7aLQT0dZR+AlfA Cc: dhaval-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, snitzer-H+wXaHxf7aLQT0dZR+AlfA, dm-devel-H+wXaHxf7aLQT0dZR+AlfA, jens.axboe-QHcLZuEGTsvQT0dZR+AlfA, agk-H+wXaHxf7aLQT0dZR+AlfA, balbir-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, paolo.valente-rcYM44yAMweonA0d6jMUrA, fernando-gVGce1chcLdL9jVzuh4AOg, jmoyer-H+wXaHxf7aLQT0dZR+AlfA, fchecconi-Re5JQEeQqe8AvxtiuMwx3w, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, righi.andrea-Re5JQEeQqe8AvxtiuMwx3w

Hi Vivek,

> Ryo, dm-ioband breaks the notion of classes and priority of CFQ because
> of FIFO dispatch of buffered bios. Apart from that it tries to provide
> fairness in terms of actual IO done and that would mean a seeky workload
> can use the disk for much longer to get equivalent IO done and slow down
> other applications. Implementing IO controller at IO scheduler level gives
> us tighter control. Will it not meet your requirements? If you got specific
> concerns with IO scheduler based control patches, please highlight these and
> we will see how these can be addressed.

I'd like to avoid complicating the existing IO schedulers and other kernel code, and to give users a choice of whether or not to use it. I know that you chose an approach that uses compile-time options to get the same behavior as the old system, but device-mapper drivers can be added, removed and replaced while the system is running.

Thanks,
Ryo Tsuruta

^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: IO scheduler based IO Controller V2 [not found] ` <20090506203228.GH8180-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2009-05-06 21:34 ` Andrea Righi 0 siblings, 0 replies; 97+ messages in thread From: Andrea Righi @ 2009-05-06 21:34 UTC (permalink / raw) To: Vivek Goyal Cc: dhaval-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, snitzer-H+wXaHxf7aLQT0dZR+AlfA, dm-devel-H+wXaHxf7aLQT0dZR+AlfA, jens.axboe-QHcLZuEGTsvQT0dZR+AlfA, agk-H+wXaHxf7aLQT0dZR+AlfA, balbir-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, paolo.valente-rcYM44yAMweonA0d6jMUrA, fernando-gVGce1chcLdL9jVzuh4AOg, jmoyer-H+wXaHxf7aLQT0dZR+AlfA, fchecconi-Re5JQEeQqe8AvxtiuMwx3w, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Andrew Morton On Wed, May 06, 2009 at 04:32:28PM -0400, Vivek Goyal wrote: > Hi Andrea and others, > > I always had this doubt in mind that any kind of 2nd level controller will > have no idea about underlying IO scheduler queues/semantics. So while it > can implement a particular cgroup policy (max bw like io-throttle or > proportional bw like dm-ioband) but there are high chances that it will > break IO scheduler's semantics in one way or other. > > I had already sent out the results for dm-ioband in a separate thread. > > http://linux.derkeiler.com/Mailing-Lists/Kernel/2009-04/msg07258.html > http://linux.derkeiler.com/Mailing-Lists/Kernel/2009-04/msg07573.html > http://linux.derkeiler.com/Mailing-Lists/Kernel/2009-04/msg08177.html > http://linux.derkeiler.com/Mailing-Lists/Kernel/2009-04/msg08345.html > http://linux.derkeiler.com/Mailing-Lists/Kernel/2009-04/msg08355.html > > Here are some basic results with io-throttle. Andrea, please let me know > if you think this is procedural problem. Playing with io-throttle patches > for the first time. > > I took V16 of your patches and trying it out with 2.6.30-rc4 with CFQ > scheduler. > > I have got one SATA drive with one partition on it. 
> > I am trying to create one cgroup and assign an 8MB/s limit to it and launch > one RT prio 0 task and one BE prio 7 task and see how this 8MB/s is divided > between these tasks. Following are the results. > > Following is my test script. > > ******************************************************************* > #!/bin/bash > > mount /dev/sdb1 /mnt/sdb > > mount -t cgroup -o blockio blockio /cgroup/iot/ > mkdir -p /cgroup/iot/test1 /cgroup/iot/test2 > > # Set bw limit of 8 MB/s on sdb > echo "/dev/sdb:$((8 * 1024 * 1024)):0:0" > > /cgroup/iot/test1/blockio.bandwidth-max > > sync > echo 3 > /proc/sys/vm/drop_caches > > echo $$ > /cgroup/iot/test1/tasks > > # Launch a normal prio reader. > ionice -c 2 -n 7 dd if=/mnt/sdb/zerofile1 of=/dev/zero & > pid1=$! > echo $pid1 > > # Launch an RT reader > ionice -c 1 -n 0 dd if=/mnt/sdb/zerofile2 of=/dev/zero & > pid2=$! > echo $pid2 > > wait $pid2 > echo "RT task finished" > ********************************************************************** > > Test1 > ===== > Test two readers (one RT class and one BE class) and see how BW is > allocated within the cgroup > > With io-throttle patches > ------------------------ > - Two readers, first BE prio 7, second RT prio 0 > > 234179072 bytes (234 MB) copied, 55.8482 s, 4.2 MB/s > 234179072 bytes (234 MB) copied, 55.8975 s, 4.2 MB/s > RT task finished > > Note: See, there is no difference in the performance of RT or BE task. > Looks like these got throttled equally. OK, this is coherent with the current io-throttle implementation. IO requests are throttled without the concept of the ioprio model. We could try to distribute the throttle using a function of each task's ioprio, but ok, the obvious drawback is that it totally breaks the logic used by the underlying layers. BTW, I'm wondering, is it a very critical issue? I would say why not move the RT task to a different cgroup with unlimited BW? or limited BW but with other tasks running at the same IO priority... 
could the cgroup subsystem be a more flexible and customizable framework respect to the current ioprio model? I'm not saying we have to ignore the problem, just trying to evaluate the impact and alternatives. And I'm still convinced that also providing per-cgroup ioprio would be an important feature. > > > Without io-throttle patches > ---------------------------- > - Two readers, first BE prio 7, second RT prio 0 > > 234179072 bytes (234 MB) copied, 2.81801 s, 83.1 MB/s > RT task finished > 234179072 bytes (234 MB) copied, 5.28238 s, 44.3 MB/s > > Note: Because I can't limit the BW without io-throttle patches, so don't > worry about increased BW. But the important point is that RT task > gets much more BW than a BE prio 7 task. > > Test2 > ==== > - Test 2 readers (One BE prio 0 and one BE prio 7) and see how BW is > distributed among these. > > With io-throttle patches > ------------------------ > - Two readers, first BE prio 7, second BE prio 0 > > 234179072 bytes (234 MB) copied, 55.8604 s, 4.2 MB/s > 234179072 bytes (234 MB) copied, 55.8918 s, 4.2 MB/s > High prio reader finished Ditto. > > Without io-throttle patches > --------------------------- > - Two readers, first BE prio 7, second BE prio 0 > > 234179072 bytes (234 MB) copied, 4.12074 s, 56.8 MB/s > High prio reader finished > 234179072 bytes (234 MB) copied, 5.36023 s, 43.7 MB/s > > Note: There is no service differentiation between prio 0 and prio 7 task > with io-throttle patches. > > Test 3 > ====== > - Run the one RT reader and one BE reader in root cgroup without any > limitations. I guess this should mean unlimited BW and behavior should > be same as with CFQ without io-throttling patches. > > With io-throttle patches > ========================= > Ran the test 4 times because I was getting different results in different > runs. 
> > - Two readers, one RT prio 0 other BE prio 7 > > 234179072 bytes (234 MB) copied, 2.74604 s, 85.3 MB/s > 234179072 bytes (234 MB) copied, 5.20995 s, 44.9 MB/s > RT task finished > > 234179072 bytes (234 MB) copied, 4.54417 s, 51.5 MB/s > RT task finished > 234179072 bytes (234 MB) copied, 5.23396 s, 44.7 MB/s > > 234179072 bytes (234 MB) copied, 5.17727 s, 45.2 MB/s > RT task finished > 234179072 bytes (234 MB) copied, 5.25894 s, 44.5 MB/s > > 234179072 bytes (234 MB) copied, 2.74141 s, 85.4 MB/s > 234179072 bytes (234 MB) copied, 5.20536 s, 45.0 MB/s > RT task finished > > Note: Out of 4 runs, looks like twice it is complete priority inversion > and RT task finished after BE task. Rest of the two times, the > difference between BW of RT and BE task is much less as compared to > without patches. In fact once it was almost same. This is strange. If you don't set any limit there shouldn't be any difference respect to the other case (without io-throttle patches). At worst a small overhead given by the task_to_iothrottle(), under rcu_read_lock(). I'll repeat this test ASAP and see if I'll be able to reproduce this strange behaviour. > > Without io-throttle patches. > =========================== > - Two readers, one RT prio 0 other BE prio 7 (4 runs) > > 234179072 bytes (234 MB) copied, 2.80988 s, 83.3 MB/s > RT task finished > 234179072 bytes (234 MB) copied, 5.28228 s, 44.3 MB/s > > 234179072 bytes (234 MB) copied, 2.80659 s, 83.4 MB/s > RT task finished > 234179072 bytes (234 MB) copied, 5.27874 s, 44.4 MB/s > > 234179072 bytes (234 MB) copied, 2.79601 s, 83.8 MB/s > RT task finished > 234179072 bytes (234 MB) copied, 5.2542 s, 44.6 MB/s > > 234179072 bytes (234 MB) copied, 2.78764 s, 84.0 MB/s > RT task finished > 234179072 bytes (234 MB) copied, 5.26009 s, 44.5 MB/s > > Note, How consistent the behavior is without io-throttle patches. 
> > In summary, I think a 2nd level solution can ensure one policy on cgroups but > it will break other semantics/properties of the IO scheduler within a cgroup, as > the 2nd level solution has no idea at run time which IO scheduler is running > underneath and what kind of properties it has. > > Andrea, please try it on your setup and see if you get similar results > or not. Hopefully it is not a configuration or test procedure issue on my > side. > > Thanks > Vivek > > > The only thing which concerns me is the fact that the IO scheduler does not > > have the view of the higher level logical device. So if somebody has set up a > > software RAID and wants to put a max BW limit on the software raid device, this > > solution will not work. One shall have to live with max bw limits on > > individual disks (where the io scheduler is actually running). Do your patches > > allow putting a limit on software RAID devices also? > > > > Ryo, dm-ioband breaks the notion of classes and priority of CFQ because > > of FIFO dispatch of buffered bios. Apart from that it tries to provide > > fairness in terms of actual IO done and that would mean a seeky workload > > can use the disk for much longer to get equivalent IO done and slow down > > other applications. Implementing the IO controller at IO scheduler level gives > > us tighter control. Will it not meet your requirements? If you have got specific > > concerns with the IO scheduler based control patches, please highlight these and > > we will see how these can be addressed. > > > > Thanks > > Vivek -Andrea ^ permalink raw reply [flat|nested] 97+ messages in thread
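[Editor's note: a quick sanity check on the Test1 numbers above. If io-throttle splits the 8 MB/s cgroup limit evenly between the two readers, each gets 4 MB/s, and reading a 234179072-byte file should take roughly 56 seconds, which matches the ~55.8 s reported by both dd commands. The shell arithmetic below only reuses the file size and limit from the test script; it is a consistency check, not part of either patch set.]

```shell
# Consistency check for Test1: 8 MB/s cgroup limit shared evenly by two readers.
bytes=234179072                # size of each dd read, from the output above
limit=$((8 * 1024 * 1024))     # cgroup limit written to blockio.bandwidth-max
per_task=$((limit / 2))        # even split between the two readers
secs=$((bytes / per_task))     # integer seconds; measured was ~55.8 s
echo "expected ~${secs}s per reader at $((per_task / 1024 / 1024)) MB/s"
```

This prints `expected ~55s per reader at 4 MB/s`, in line with the measured 55.85 s, supporting the observation that both tasks were throttled equally regardless of IO class.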
* Re: IO scheduler based IO Controller V2 2009-05-06 21:34 ` Andrea Righi @ 2009-05-06 21:52 ` Vivek Goyal 0 siblings, 0 replies; 97+ messages in thread From: Vivek Goyal @ 2009-05-06 21:52 UTC (permalink / raw) To: Andrea Righi Cc: dhaval-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, snitzer-H+wXaHxf7aLQT0dZR+AlfA, dm-devel-H+wXaHxf7aLQT0dZR+AlfA, jens.axboe-QHcLZuEGTsvQT0dZR+AlfA, agk-H+wXaHxf7aLQT0dZR+AlfA, balbir-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, paolo.valente-rcYM44yAMweonA0d6jMUrA, fernando-gVGce1chcLdL9jVzuh4AOg, jmoyer-H+wXaHxf7aLQT0dZR+AlfA, fchecconi-Re5JQEeQqe8AvxtiuMwx3w, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Andrew Morton On Wed, May 06, 2009 at 11:34:54PM +0200, Andrea Righi wrote: > On Wed, May 06, 2009 at 04:32:28PM -0400, Vivek Goyal wrote: > > Hi Andrea and others, > > > > I always had this doubt in mind that any kind of 2nd level controller will > > have no idea about underlying IO scheduler queues/semantics. So while it > > can implement a particular cgroup policy (max bw like io-throttle or > > proportional bw like dm-ioband) but there are high chances that it will > > break IO scheduler's semantics in one way or other. > > > > I had already sent out the results for dm-ioband in a separate thread. > > > > http://linux.derkeiler.com/Mailing-Lists/Kernel/2009-04/msg07258.html > > http://linux.derkeiler.com/Mailing-Lists/Kernel/2009-04/msg07573.html > > http://linux.derkeiler.com/Mailing-Lists/Kernel/2009-04/msg08177.html > > http://linux.derkeiler.com/Mailing-Lists/Kernel/2009-04/msg08345.html > > http://linux.derkeiler.com/Mailing-Lists/Kernel/2009-04/msg08355.html > > > > Here are some basic results with io-throttle. Andrea, please let me know > > if you think this is procedural problem. Playing with io-throttle patches > > for the first time. > > > > I took V16 of your patches and trying it out with 2.6.30-rc4 with CFQ > > scheduler. > > > > I have got one SATA drive with one partition on it. 
> > > > I am trying to create one cgroup and assign an 8MB/s limit to it and launch > > one RT prio 0 task and one BE prio 7 task and see how this 8MB/s is divided > > between these tasks. Following are the results. > > > > Following is my test script. > > > > ******************************************************************* > > #!/bin/bash > > > > mount /dev/sdb1 /mnt/sdb > > > > mount -t cgroup -o blockio blockio /cgroup/iot/ > > mkdir -p /cgroup/iot/test1 /cgroup/iot/test2 > > > > # Set bw limit of 8 MB/s on sdb > > echo "/dev/sdb:$((8 * 1024 * 1024)):0:0" > > > /cgroup/iot/test1/blockio.bandwidth-max > > > > sync > > echo 3 > /proc/sys/vm/drop_caches > > > > echo $$ > /cgroup/iot/test1/tasks > > > > # Launch a normal prio reader. > > ionice -c 2 -n 7 dd if=/mnt/sdb/zerofile1 of=/dev/zero & > > pid1=$! > > echo $pid1 > > > > # Launch an RT reader > > ionice -c 1 -n 0 dd if=/mnt/sdb/zerofile2 of=/dev/zero & > > pid2=$! > > echo $pid2 > > > > wait $pid2 > > echo "RT task finished" > > ********************************************************************** > > > > Test1 > > ===== > > Test two readers (one RT class and one BE class) and see how BW is > > allocated within the cgroup > > > > With io-throttle patches > > ------------------------ > > - Two readers, first BE prio 7, second RT prio 0 > > > > 234179072 bytes (234 MB) copied, 55.8482 s, 4.2 MB/s > > 234179072 bytes (234 MB) copied, 55.8975 s, 4.2 MB/s > > RT task finished > > > > Note: See, there is no difference in the performance of RT or BE task. > > Looks like these got throttled equally. > > OK, this is coherent with the current io-throttle implementation. IO > requests are throttled without the concept of the ioprio model. > > We could try to distribute the throttle using a function of each task's > ioprio, but ok, the obvious drawback is that it totally breaks the logic > used by the underlying layers. > > BTW, I'm wondering, is it a very critical issue? 
I would say why not > move the RT task to a different cgroup with unlimited BW? or limited BW > but with other tasks running at the same IO priority... So one hypothetical use case could be the following. Somebody is running a hosted server and customers are going to get their applications running in a particular cgroup with a limit on max bw. root / | \ cust1 cust2 cust3 (20 MB/s) (40MB/s) (30MB/s) Now all three customers will run their own applications/virtual machines in their respective groups with upper limits. Will we tell them that all their tasks will be considered the same class and the same prio level? Assume cust1 is running a hypothetical application which creates multiple threads and assigns these threads different priorities based on its needs at run time. How would we handle this? You can't collect all the RT tasks from all customers and move these to a single cgroup. Or ask customers to separate out their tasks based on priority level and give them multiple groups of different priority. > could the cgroup > subsystem be a more flexible and customizable framework respect to the > current ioprio model? > > I'm not saying we have to ignore the problem, just trying to evaluate > the impact and alternatives. And I'm still convinced that also providing > per-cgroup ioprio would be an important feature. > > > > > > > Without io-throttle patches > > ---------------------------- > > - Two readers, first BE prio 7, second RT prio 0 > > > > 234179072 bytes (234 MB) copied, 2.81801 s, 83.1 MB/s > > RT task finished > > 234179072 bytes (234 MB) copied, 5.28238 s, 44.3 MB/s > > > > Note: Because I can't limit the BW without io-throttle patches, so don't > > worry about increased BW. But the important point is that RT task > > gets much more BW than a BE prio 7 task. > > > > Test2 > > ==== > > - Test 2 readers (One BE prio 0 and one BE prio 7) and see how BW is > > distributed among these. 
> > > > With io-throttle patches > > ------------------------ > > - Two readers, first BE prio 7, second BE prio 0 > > > > 234179072 bytes (234 MB) copied, 55.8604 s, 4.2 MB/s > > 234179072 bytes (234 MB) copied, 55.8918 s, 4.2 MB/s > > High prio reader finished > > Ditto. > > > > > Without io-throttle patches > > --------------------------- > > - Two readers, first BE prio 7, second BE prio 0 > > > > 234179072 bytes (234 MB) copied, 4.12074 s, 56.8 MB/s > > High prio reader finished > > 234179072 bytes (234 MB) copied, 5.36023 s, 43.7 MB/s > > > > Note: There is no service differentiation between prio 0 and prio 7 task > > with io-throttle patches. > > > > Test 3 > > ====== > > - Run the one RT reader and one BE reader in root cgroup without any > > limitations. I guess this should mean unlimited BW and behavior should > > be same as with CFQ without io-throttling patches. > > > > With io-throttle patches > > ========================= > > Ran the test 4 times because I was getting different results in different > > runs. > > > > - Two readers, one RT prio 0 other BE prio 7 > > > > 234179072 bytes (234 MB) copied, 2.74604 s, 85.3 MB/s > > 234179072 bytes (234 MB) copied, 5.20995 s, 44.9 MB/s > > RT task finished > > > > 234179072 bytes (234 MB) copied, 4.54417 s, 51.5 MB/s > > RT task finished > > 234179072 bytes (234 MB) copied, 5.23396 s, 44.7 MB/s > > > > 234179072 bytes (234 MB) copied, 5.17727 s, 45.2 MB/s > > RT task finished > > 234179072 bytes (234 MB) copied, 5.25894 s, 44.5 MB/s > > > > 234179072 bytes (234 MB) copied, 2.74141 s, 85.4 MB/s > > 234179072 bytes (234 MB) copied, 5.20536 s, 45.0 MB/s > > RT task finished > > > > Note: Out of 4 runs, looks like twice it is complete priority inversion > > and RT task finished after BE task. Rest of the two times, the > > difference between BW of RT and BE task is much less as compared to > > without patches. In fact once it was almost same. > > This is strange. 
If you don't set any limit there shouldn't be any > difference with respect to the other case (without io-throttle patches). > > At worst a small overhead given by the task_to_iothrottle(), under > rcu_read_lock(). I'll repeat this test ASAP and see if I'll be able to > reproduce this strange behaviour. Ya, I also found this strange. At least in the root group there should not be any behavior change (at most one might expect a little drop in throughput because of the extra code). Thanks Vivek ^ permalink raw reply [flat|nested] 97+ messages in thread
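[Editor's note: to make the cust1/cust2/cust3 hosted-server example above concrete, the per-customer limits could be expressed with the same io-throttle interface used in the test script, i.e. a "/dev/<disk>:<bytes per sec>:0:0" string written to blockio.bandwidth-max. The group names and the /dev/sdb device below are assumptions for illustration; the configuration lines are only printed, not applied to a live system.]

```shell
# Sketch: per-customer max-bw limits for the cust1/cust2/cust3 example.
# Uses the "/dev/<disk>:<bytes per sec>:0:0" format from the test script;
# group names and the device are hypothetical.
MB=$((1024 * 1024))
for spec in cust1:20 cust2:40 cust3:30; do
    name=${spec%%:*}                  # cgroup name, e.g. cust1
    bw=${spec##*:}                    # limit in MB/s, e.g. 20
    line="/dev/sdb:$((bw * MB)):0:0"
    # On a live system one would: mkdir -p /cgroup/iot/$name and then
    # echo "$line" > /cgroup/iot/$name/blockio.bandwidth-max
    echo "$name -> $line"
done
```

Note that this only expresses the per-group ceilings; it says nothing about how IO classes and priorities of tasks inside each group are honored, which is exactly the point under discussion.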
* Re: IO scheduler based IO Controller V2 @ 2009-05-06 21:52 ` Vivek Goyal 0 siblings, 0 replies; 97+ messages in thread From: Vivek Goyal @ 2009-05-06 21:52 UTC (permalink / raw) To: Andrea Righi Cc: Andrew Morton, nauman, dpshah, lizf, mikew, fchecconi, paolo.valente, jens.axboe, ryov, fernando, s-uchida, taka, guijianfeng, jmoyer, dhaval, balbir, linux-kernel, containers, agk, dm-devel, snitzer, m-ikeda, peterz On Wed, May 06, 2009 at 11:34:54PM +0200, Andrea Righi wrote: > On Wed, May 06, 2009 at 04:32:28PM -0400, Vivek Goyal wrote: > > Hi Andrea and others, > > > > I always had this doubt in mind that any kind of 2nd level controller will > > have no idea about underlying IO scheduler queues/semantics. So while it > > can implement a particular cgroup policy (max bw like io-throttle or > > proportional bw like dm-ioband) but there are high chances that it will > > break IO scheduler's semantics in one way or other. > > > > I had already sent out the results for dm-ioband in a separate thread. > > > > http://linux.derkeiler.com/Mailing-Lists/Kernel/2009-04/msg07258.html > > http://linux.derkeiler.com/Mailing-Lists/Kernel/2009-04/msg07573.html > > http://linux.derkeiler.com/Mailing-Lists/Kernel/2009-04/msg08177.html > > http://linux.derkeiler.com/Mailing-Lists/Kernel/2009-04/msg08345.html > > http://linux.derkeiler.com/Mailing-Lists/Kernel/2009-04/msg08355.html > > > > Here are some basic results with io-throttle. Andrea, please let me know > > if you think this is procedural problem. Playing with io-throttle patches > > for the first time. > > > > I took V16 of your patches and trying it out with 2.6.30-rc4 with CFQ > > scheduler. > > > > I have got one SATA drive with one partition on it. > > > > I am trying to create one cgroup and assignn 8MB/s limit to it and launch > > on RT prio 0 task and one BE prio 7 task and see how this 8MB/s is divided > > between these tasks. Following are the results. > > > > Following is my test script. 
> > > > ******************************************************************* > > #!/bin/bash > > > > mount /dev/sdb1 /mnt/sdb > > > > mount -t cgroup -o blockio blockio /cgroup/iot/ > > mkdir -p /cgroup/iot/test1 /cgroup/iot/test2 > > > > # Set bw limit of 8 MB/ps on sdb > > echo "/dev/sdb:$((8 * 1024 * 1024)):0:0" > > > /cgroup/iot/test1/blockio.bandwidth-max > > > > sync > > echo 3 > /proc/sys/vm/drop_caches > > > > echo $$ > /cgroup/iot/test1/tasks > > > > # Launch a normal prio reader. > > ionice -c 2 -n 7 dd if=/mnt/sdb/zerofile1 of=/dev/zero & > > pid1=$! > > echo $pid1 > > > > # Launch an RT reader > > ionice -c 1 -n 0 dd if=/mnt/sdb/zerofile2 of=/dev/zero & > > pid2=$! > > echo $pid2 > > > > wait $pid2 > > echo "RT task finished" > > ********************************************************************** > > > > Test1 > > ===== > > Test two readers (one RT class and one BE class) and see how BW is > > allocated with-in cgroup > > > > With io-throttle patches > > ------------------------ > > - Two readers, first BE prio 7, second RT prio 0 > > > > 234179072 bytes (234 MB) copied, 55.8482 s, 4.2 MB/s > > 234179072 bytes (234 MB) copied, 55.8975 s, 4.2 MB/s > > RT task finished > > > > Note: See, there is no difference in the performance of RT or BE task. > > Looks like these got throttled equally. > > OK, this is coherent with the current io-throttle implementation. IO > requests are throttled without the concept of the ioprio model. > > We could try to distribute the throttle using a function of each task's > ioprio, but ok, the obvious drawback is that it totally breaks the logic > used by the underlying layers. > > BTW, I'm wondering, is it a very critical issue? I would say why not to > move the RT task to a different cgroup with unlimited BW? or limited BW > but with other tasks running at the same IO priority... So one of hypothetical use case probably could be following. 
Somebody is having a hosted server and customers are going to get there applications running in a particular cgroup with a limit on max bw. root / | \ cust1 cust2 cust3 (20 MB/s) (40MB/s) (30MB/s) Now all three customers will run their own applications/virtual machines in their respective groups with upper limits. Will we say to these that all your tasks will be considered as same class and same prio level. Assume cust1 is running a hypothetical application which creates multiple threads and assigns these threads different priorities based on its needs at run time. How would we handle this thing? You can't collect all the RT tasks from all customers and move these to a single cgroup. Or ask customers to separate out their tasks based on priority level and give them multiple groups of different priority. > could the cgroup > subsystem be a more flexible and customizable framework respect to the > current ioprio model? > > I'm not saying we have to ignore the problem, just trying to evaluate > the impact and alternatives. And I'm still convinced that also providing > per-cgroup ioprio would be an important feature. > > > > > > > Without io-throttle patches > > ---------------------------- > > - Two readers, first BE prio 7, second RT prio 0 > > > > 234179072 bytes (234 MB) copied, 2.81801 s, 83.1 MB/s > > RT task finished > > 234179072 bytes (234 MB) copied, 5.28238 s, 44.3 MB/s > > > > Note: Because I can't limit the BW without io-throttle patches, so don't > > worry about increased BW. But the important point is that RT task > > gets much more BW than a BE prio 7 task. > > > > Test2 > > ==== > > - Test 2 readers (One BE prio 0 and one BE prio 7) and see how BW is > > distributed among these. > > > > With io-throttle patches > > ------------------------ > > - Two readers, first BE prio 7, second BE prio 0 > > > > 234179072 bytes (234 MB) copied, 55.8604 s, 4.2 MB/s > > 234179072 bytes (234 MB) copied, 55.8918 s, 4.2 MB/s > > High prio reader finished > > Ditto. 
> > > > > Without io-throttle patches > > --------------------------- > > - Two readers, first BE prio 7, second BE prio 0 > > > > 234179072 bytes (234 MB) copied, 4.12074 s, 56.8 MB/s > > High prio reader finished > > 234179072 bytes (234 MB) copied, 5.36023 s, 43.7 MB/s > > > > Note: There is no service differentiation between prio 0 and prio 7 task > > with io-throttle patches. > > > > Test 3 > > ====== > > - Run the one RT reader and one BE reader in root cgroup without any > > limitations. I guess this should mean unlimited BW and behavior should > > be same as with CFQ without io-throttling patches. > > > > With io-throttle patches > > ========================= > > Ran the test 4 times because I was getting different results in different > > runs. > > > > - Two readers, one RT prio 0 other BE prio 7 > > > > 234179072 bytes (234 MB) copied, 2.74604 s, 85.3 MB/s > > 234179072 bytes (234 MB) copied, 5.20995 s, 44.9 MB/s > > RT task finished > > > > 234179072 bytes (234 MB) copied, 4.54417 s, 51.5 MB/s > > RT task finished > > 234179072 bytes (234 MB) copied, 5.23396 s, 44.7 MB/s > > > > 234179072 bytes (234 MB) copied, 5.17727 s, 45.2 MB/s > > RT task finished > > 234179072 bytes (234 MB) copied, 5.25894 s, 44.5 MB/s > > > > 234179072 bytes (234 MB) copied, 2.74141 s, 85.4 MB/s > > 234179072 bytes (234 MB) copied, 5.20536 s, 45.0 MB/s > > RT task finished > > > > Note: Out of 4 runs, looks like twice it is complete priority inversion > > and RT task finished after BE task. Rest of the two times, the > > difference between BW of RT and BE task is much less as compared to > > without patches. In fact once it was almost same. > > This is strange. If you don't set any limit there shouldn't be any > difference respect to the other case (without io-throttle patches). > > At worst a small overhead given by the task_to_iothrottle(), under > rcu_read_lock(). I'll repeat this test ASAP and see if I'll be able to > reproduce this strange behaviour. 
Ya, I also found this strange. At least in the root group there should not be any behavior change (at most one might expect a little drop in throughput because of the extra code).

Thanks
Vivek

^ permalink raw reply	[flat|nested] 97+ messages in thread
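[Editorial note: the per-customer hierarchy discussed above maps onto io-throttle's interface fairly directly. A hypothetical sketch that generates the setup commands — the `/dev/sdb:<bytes>:0:0` rule format and the `blockio.bandwidth-max` file are taken from the test script quoted in this thread; the group names, mount point, and the two trailing `0` fields left at their quoted values are illustrative assumptions:]

```python
# Hypothetical sketch: emit io-throttle setup commands for the
# root/cust1-3 hierarchy above. Not io-throttle code; just builds
# the rule strings in the format used by the test script below.
limits_mb = {"cust1": 20, "cust2": 40, "cust3": 30}

def bandwidth_rule(dev, mb_per_s):
    """One bandwidth-max rule: device, bytes/s, plus two trailing
    fields left as 0, as in the quoted test script."""
    return "%s:%d:0:0" % (dev, mb_per_s * 1024 * 1024)

for name, mb in sorted(limits_mb.items()):
    print("mkdir -p /cgroup/iot/%s" % name)
    print('echo "%s" > /cgroup/iot/%s/blockio.bandwidth-max'
          % (bandwidth_rule("/dev/sdb", mb), name))
```

Each customer's tasks would then be moved into their group via the usual `echo $PID > /cgroup/iot/custN/tasks`.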
* Re: IO scheduler based IO Controller V2 2009-05-06 21:52 ` Vivek Goyal (?) @ 2009-05-06 22:35 ` Andrea Righi 2009-05-07 1:48 ` Ryo Tsuruta 2009-05-07 1:48 ` Ryo Tsuruta -1 siblings, 2 replies; 97+ messages in thread From: Andrea Righi @ 2009-05-06 22:35 UTC (permalink / raw) To: Vivek Goyal Cc: Andrew Morton, nauman, dpshah, lizf, mikew, fchecconi, paolo.valente, jens.axboe, ryov, fernando, s-uchida, taka, guijianfeng, jmoyer, dhaval, balbir, linux-kernel, containers, agk, dm-devel, snitzer, m-ikeda, peterz On Wed, May 06, 2009 at 05:52:35PM -0400, Vivek Goyal wrote: > On Wed, May 06, 2009 at 11:34:54PM +0200, Andrea Righi wrote: > > On Wed, May 06, 2009 at 04:32:28PM -0400, Vivek Goyal wrote: > > > Hi Andrea and others, > > > > > > I always had this doubt in mind that any kind of 2nd level controller will > > > have no idea about underlying IO scheduler queues/semantics. So while it > > > can implement a particular cgroup policy (max bw like io-throttle or > > > proportional bw like dm-ioband) but there are high chances that it will > > > break IO scheduler's semantics in one way or other. > > > > > > I had already sent out the results for dm-ioband in a separate thread. > > > > > > http://linux.derkeiler.com/Mailing-Lists/Kernel/2009-04/msg07258.html > > > http://linux.derkeiler.com/Mailing-Lists/Kernel/2009-04/msg07573.html > > > http://linux.derkeiler.com/Mailing-Lists/Kernel/2009-04/msg08177.html > > > http://linux.derkeiler.com/Mailing-Lists/Kernel/2009-04/msg08345.html > > > http://linux.derkeiler.com/Mailing-Lists/Kernel/2009-04/msg08355.html > > > > > > Here are some basic results with io-throttle. Andrea, please let me know > > > if you think this is procedural problem. Playing with io-throttle patches > > > for the first time. > > > > > > I took V16 of your patches and trying it out with 2.6.30-rc4 with CFQ > > > scheduler. > > > > > > I have got one SATA drive with one partition on it. 
> > >
> > > I am trying to create one cgroup and assign an 8MB/s limit to it, launch
> > > one RT prio 0 task and one BE prio 7 task, and see how this 8MB/s is divided
> > > between these tasks. Following are the results.
> > >
> > > Following is my test script.
> > >
> > > *******************************************************************
> > > #!/bin/bash
> > >
> > > mount /dev/sdb1 /mnt/sdb
> > >
> > > mount -t cgroup -o blockio blockio /cgroup/iot/
> > > mkdir -p /cgroup/iot/test1 /cgroup/iot/test2
> > >
> > > # Set bw limit of 8 MB/s on sdb
> > > echo "/dev/sdb:$((8 * 1024 * 1024)):0:0" >
> > > 	/cgroup/iot/test1/blockio.bandwidth-max
> > >
> > > sync
> > > echo 3 > /proc/sys/vm/drop_caches
> > >
> > > echo $$ > /cgroup/iot/test1/tasks
> > >
> > > # Launch a normal prio reader.
> > > ionice -c 2 -n 7 dd if=/mnt/sdb/zerofile1 of=/dev/zero &
> > > pid1=$!
> > > echo $pid1
> > >
> > > # Launch an RT reader
> > > ionice -c 1 -n 0 dd if=/mnt/sdb/zerofile2 of=/dev/zero &
> > > pid2=$!
> > > echo $pid2
> > >
> > > wait $pid2
> > > echo "RT task finished"
> > > **********************************************************************
> > >
> > > Test1
> > > =====
> > > Test two readers (one RT class and one BE class) and see how BW is
> > > allocated with-in cgroup
> > >
> > > With io-throttle patches
> > > ------------------------
> > > - Two readers, first BE prio 7, second RT prio 0
> > >
> > > 234179072 bytes (234 MB) copied, 55.8482 s, 4.2 MB/s
> > > 234179072 bytes (234 MB) copied, 55.8975 s, 4.2 MB/s
> > > RT task finished
> > >
> > > Note: See, there is no difference in the performance of the RT or BE task.
> > > Looks like these got throttled equally.
> >
> > OK, this is coherent with the current io-throttle implementation. IO
> > requests are throttled without the concept of the ioprio model.
> >
> > We could try to distribute the throttle using a function of each task's
> > ioprio, but ok, the obvious drawback is that it totally breaks the logic
> > used by the underlying layers.
> >
> > BTW, I'm wondering, is it a very critical issue? I would say why not
> > move the RT task to a different cgroup with unlimited BW? Or limited BW
> > but with other tasks running at the same IO priority...
>
> So one hypothetical use case probably could be the following. Somebody
> is having a hosted server and customers are going to get their
> applications running in a particular cgroup with a limit on max bw.
>
>                         root
>                       /  |  \
>                  cust1 cust2 cust3
>              (20 MB/s) (40 MB/s) (30 MB/s)
>
> Now all three customers will run their own applications/virtual machines
> in their respective groups with upper limits. Will we tell them that
> all their tasks will be considered the same class and same prio level?
>
> Assume cust1 is running a hypothetical application which creates multiple
> threads and assigns these threads different priorities based on its needs
> at run time. How would we handle this thing?
>
> You can't collect all the RT tasks from all customers and move these to a
> single cgroup. Or ask customers to separate out their tasks based on
> priority level and give them multiple groups of different priority.

Clear.

Unfortunately, I think, with absolute BW limits, at a certain point, if
we hit the limit, we need to block the IO request. That's the same
whether we block when dispatching or when submitting the request. And the
risk is to break the logic of the IO priorities and fall into the classic
priority inversion problem.

The difference is that working at the CFQ level probably gives better
control, so we can handle these cases appropriately and avoid the
priority inversion problems.

Thanks,
-Andrea

^ permalink raw reply	[flat|nested] 97+ messages in thread
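[Editorial note: the priority-inversion risk discussed above can be illustrated with a toy model — this is not io-throttle's actual code, just a sketch of what happens when a 2nd-level throttler queues blocked requests and releases them FIFO, ignoring IO priority:]

```python
from collections import deque

def fifo_throttle(requests, tokens_per_tick):
    """Toy 2nd-level throttler: blocked requests are queued and
    released FIFO under a token budget, with no notion of ioprio."""
    queue = deque(requests)      # (name, ioprio) tuples, arrival order
    finish_tick = {}
    tick = 0
    while queue:
        tick += 1
        for _ in range(tokens_per_tick):
            if not queue:
                break
            name, _ioprio = queue.popleft()
            finish_tick[name] = tick
    return finish_tick

# Four BE prio-7 requests arrive just before one RT prio-0 request;
# the bandwidth limit releases one request per tick.
done = fifo_throttle(
    [("be1", 7), ("be2", 7), ("be3", 7), ("be4", 7), ("rt", 0)],
    tokens_per_tick=1)
print(done)  # the RT request finishes last: priority inversion
```

The RT request cannot complete before the BE requests queued ahead of it, which is exactly the inversion seen in Test 3 above; a priority-aware dispatcher (as CFQ has internally) would pop the RT request first.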
* Re: IO scheduler based IO Controller V2 2009-05-06 22:35 ` Andrea Righi @ 2009-05-07 1:48 ` Ryo Tsuruta 2009-05-07 1:48 ` Ryo Tsuruta 1 sibling, 0 replies; 97+ messages in thread From: Ryo Tsuruta @ 2009-05-07 1:48 UTC (permalink / raw) To: righi.andrea-Re5JQEeQqe8AvxtiuMwx3w Cc: dhaval-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, snitzer-H+wXaHxf7aLQT0dZR+AlfA, dm-devel-H+wXaHxf7aLQT0dZR+AlfA, jens.axboe-QHcLZuEGTsvQT0dZR+AlfA, agk-H+wXaHxf7aLQT0dZR+AlfA, balbir-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, paolo.valente-rcYM44yAMweonA0d6jMUrA, fernando-gVGce1chcLdL9jVzuh4AOg, jmoyer-H+wXaHxf7aLQT0dZR+AlfA, fchecconi-Re5JQEeQqe8AvxtiuMwx3w, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b From: Andrea Righi <righi.andrea-Re5JQEeQqe8AvxtiuMwx3w@public.gmane.org> Subject: Re: IO scheduler based IO Controller V2 Date: Thu, 7 May 2009 00:35:13 +0200 > On Wed, May 06, 2009 at 05:52:35PM -0400, Vivek Goyal wrote: > > On Wed, May 06, 2009 at 11:34:54PM +0200, Andrea Righi wrote: > > > On Wed, May 06, 2009 at 04:32:28PM -0400, Vivek Goyal wrote: > > > > Hi Andrea and others, > > > > > > > > I always had this doubt in mind that any kind of 2nd level controller will > > > > have no idea about underlying IO scheduler queues/semantics. So while it > > > > can implement a particular cgroup policy (max bw like io-throttle or > > > > proportional bw like dm-ioband) but there are high chances that it will > > > > break IO scheduler's semantics in one way or other. > > > > > > > > I had already sent out the results for dm-ioband in a separate thread. 
> > > > > > > > http://linux.derkeiler.com/Mailing-Lists/Kernel/2009-04/msg07258.html > > > > http://linux.derkeiler.com/Mailing-Lists/Kernel/2009-04/msg07573.html > > > > http://linux.derkeiler.com/Mailing-Lists/Kernel/2009-04/msg08177.html > > > > http://linux.derkeiler.com/Mailing-Lists/Kernel/2009-04/msg08345.html > > > > http://linux.derkeiler.com/Mailing-Lists/Kernel/2009-04/msg08355.html > > > > > > > > Here are some basic results with io-throttle. Andrea, please let me know > > > > if you think this is procedural problem. Playing with io-throttle patches > > > > for the first time. > > > > > > > > I took V16 of your patches and trying it out with 2.6.30-rc4 with CFQ > > > > scheduler. > > > > > > > > I have got one SATA drive with one partition on it. > > > > > > > > I am trying to create one cgroup and assignn 8MB/s limit to it and launch > > > > on RT prio 0 task and one BE prio 7 task and see how this 8MB/s is divided > > > > between these tasks. Following are the results. > > > > > > > > Following is my test script. > > > > > > > > ******************************************************************* > > > > #!/bin/bash > > > > > > > > mount /dev/sdb1 /mnt/sdb > > > > > > > > mount -t cgroup -o blockio blockio /cgroup/iot/ > > > > mkdir -p /cgroup/iot/test1 /cgroup/iot/test2 > > > > > > > > # Set bw limit of 8 MB/ps on sdb > > > > echo "/dev/sdb:$((8 * 1024 * 1024)):0:0" > > > > > /cgroup/iot/test1/blockio.bandwidth-max > > > > > > > > sync > > > > echo 3 > /proc/sys/vm/drop_caches > > > > > > > > echo $$ > /cgroup/iot/test1/tasks > > > > > > > > # Launch a normal prio reader. > > > > ionice -c 2 -n 7 dd if=/mnt/sdb/zerofile1 of=/dev/zero & > > > > pid1=$! > > > > echo $pid1 > > > > > > > > # Launch an RT reader > > > > ionice -c 1 -n 0 dd if=/mnt/sdb/zerofile2 of=/dev/zero & > > > > pid2=$! 
> > > > echo $pid2 > > > > > > > > wait $pid2 > > > > echo "RT task finished" > > > > ********************************************************************** > > > > > > > > Test1 > > > > ===== > > > > Test two readers (one RT class and one BE class) and see how BW is > > > > allocated with-in cgroup > > > > > > > > With io-throttle patches > > > > ------------------------ > > > > - Two readers, first BE prio 7, second RT prio 0 > > > > > > > > 234179072 bytes (234 MB) copied, 55.8482 s, 4.2 MB/s > > > > 234179072 bytes (234 MB) copied, 55.8975 s, 4.2 MB/s > > > > RT task finished > > > > > > > > Note: See, there is no difference in the performance of RT or BE task. > > > > Looks like these got throttled equally. > > > > > > OK, this is coherent with the current io-throttle implementation. IO > > > requests are throttled without the concept of the ioprio model. > > > > > > We could try to distribute the throttle using a function of each task's > > > ioprio, but ok, the obvious drawback is that it totally breaks the logic > > > used by the underlying layers. > > > > > > BTW, I'm wondering, is it a very critical issue? I would say why not to > > > move the RT task to a different cgroup with unlimited BW? or limited BW > > > but with other tasks running at the same IO priority... > > > > So one of hypothetical use case probably could be following. Somebody > > is having a hosted server and customers are going to get there > > applications running in a particular cgroup with a limit on max bw. > > > > root > > / | \ > > cust1 cust2 cust3 > > (20 MB/s) (40MB/s) (30MB/s) > > > > Now all three customers will run their own applications/virtual machines > > in their respective groups with upper limits. Will we say to these that > > all your tasks will be considered as same class and same prio level. > > > > Assume cust1 is running a hypothetical application which creates multiple > > threads and assigns these threads different priorities based on its needs > > at run time. 
> > How would we handle this thing?
> >
> > You can't collect all the RT tasks from all customers and move these to a
> > single cgroup. Or ask customers to separate out their tasks based on
> > priority level and give them multiple groups of different priority.
>
> Clear.
>
> Unfortunately, I think, with absolute BW limits at a certain point, if
> we hit the limit, we need to block the IO request. That's the same
> either, when we dispatch or submit the request. And the risk is to break
> the logic of the IO priorities and fall in the classic priority
> inversion problem.
>
> The difference is that probably working at the CFQ level gives a better
> control so we can handle these cases appropriately and avoid the
> priority inversion problems.
>
> Thanks,
> -Andrea

If RT tasks in cust1 issue IOs intensively, are IOs issued by BE tasks
running in cust2 and cust3 suppressed, so that cust1 can use the whole
bandwidth? I think that CFQ's class and priority should be preserved
within the bandwidth given to each cgroup.

Thanks,
Ryo Tsuruta

^ permalink raw reply	[flat|nested] 97+ messages in thread
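[Editorial note: preserving CFQ's class and priority within each cgroup's bandwidth, as suggested above, amounts to two-level scheduling. A minimal sketch — a hypothetical model, not CFQ or dm-ioband code — that first picks the cgroup furthest behind its share, then serves the highest-priority queue inside that cgroup only:]

```python
def pick_next(cgroups):
    """Toy two-level pick. `cgroups` maps name ->
    {"weight": w, "served": s, "queues": {prio: [requests]}}.
    Lower prio number = higher priority (RT-like)."""
    candidates = [(name, g) for name, g in cgroups.items()
                  if any(g["queues"].values())]
    if not candidates:
        return None
    # Level 1: proportional share across groups (lowest served/weight).
    name, g = min(candidates, key=lambda kv: kv[1]["served"] / kv[1]["weight"])
    # Level 2: priority order *within* the chosen group only.
    prio = min(p for p, q in g["queues"].items() if q)
    request = g["queues"][prio].pop(0)
    g["served"] += 1
    return (name, prio, request)

cgroups = {
    "cust1": {"weight": 2, "served": 0, "queues": {0: ["rt-io"], 7: ["be-io"]}},
    "cust2": {"weight": 1, "served": 0, "queues": {7: ["be-io"]}},
}
order = [pick_next(cgroups) for _ in range(3)]
print(order)
```

Here cust1's RT request wins over its own BE request, but cust2 still gets its turn between cust1's dispatches, so one group's RT traffic cannot starve the other groups.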
* Re: IO scheduler based IO Controller V2 [not found] ` <20090506215235.GJ8180-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2009-05-06 22:35 ` Andrea Righi @ 2009-05-07 9:04 ` Andrea Righi 1 sibling, 0 replies; 97+ messages in thread From: Andrea Righi @ 2009-05-07 9:04 UTC (permalink / raw) To: Vivek Goyal Cc: dhaval-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, snitzer-H+wXaHxf7aLQT0dZR+AlfA, dm-devel-H+wXaHxf7aLQT0dZR+AlfA, jens.axboe-QHcLZuEGTsvQT0dZR+AlfA, agk-H+wXaHxf7aLQT0dZR+AlfA, balbir-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, paolo.valente-rcYM44yAMweonA0d6jMUrA, fernando-gVGce1chcLdL9jVzuh4AOg, jmoyer-H+wXaHxf7aLQT0dZR+AlfA, fchecconi-Re5JQEeQqe8AvxtiuMwx3w, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Andrew Morton On Wed, May 06, 2009 at 05:52:35PM -0400, Vivek Goyal wrote: > > > Without io-throttle patches > > > --------------------------- > > > - Two readers, first BE prio 7, second BE prio 0 > > > > > > 234179072 bytes (234 MB) copied, 4.12074 s, 56.8 MB/s > > > High prio reader finished > > > 234179072 bytes (234 MB) copied, 5.36023 s, 43.7 MB/s > > > > > > Note: There is no service differentiation between prio 0 and prio 7 task > > > with io-throttle patches. > > > > > > Test 3 > > > ====== > > > - Run the one RT reader and one BE reader in root cgroup without any > > > limitations. I guess this should mean unlimited BW and behavior should > > > be same as with CFQ without io-throttling patches. > > > > > > With io-throttle patches > > > ========================= > > > Ran the test 4 times because I was getting different results in different > > > runs. 
> > >
> > > - Two readers, one RT prio 0 other BE prio 7
> > >
> > > 234179072 bytes (234 MB) copied, 2.74604 s, 85.3 MB/s
> > > 234179072 bytes (234 MB) copied, 5.20995 s, 44.9 MB/s
> > > RT task finished
> > >
> > > 234179072 bytes (234 MB) copied, 4.54417 s, 51.5 MB/s
> > > RT task finished
> > > 234179072 bytes (234 MB) copied, 5.23396 s, 44.7 MB/s
> > >
> > > 234179072 bytes (234 MB) copied, 5.17727 s, 45.2 MB/s
> > > RT task finished
> > > 234179072 bytes (234 MB) copied, 5.25894 s, 44.5 MB/s
> > >
> > > 234179072 bytes (234 MB) copied, 2.74141 s, 85.4 MB/s
> > > 234179072 bytes (234 MB) copied, 5.20536 s, 45.0 MB/s
> > > RT task finished
> > >
> > > Note: Out of 4 runs, looks like twice it is complete priority inversion
> > > and RT task finished after BE task. Rest of the two times, the
> > > difference between BW of RT and BE task is much less as compared to
> > > without patches. In fact once it was almost same.
> >
> > This is strange. If you don't set any limit there shouldn't be any
> > difference respect to the other case (without io-throttle patches).
> >
> > At worst a small overhead given by the task_to_iothrottle(), under
> > rcu_read_lock(). I'll repeat this test ASAP and see if I'll be able to
> > reproduce this strange behaviour.
>
> Ya, I also found this strange. At least in root group there should not be
> any behavior change (at max one might expect little drop in throughput
> because of extra code).

Hi Vivek,

I'm not able to reproduce the strange behaviour above. Which commands are
you running exactly? Is the system isolated (stupid question), with no
cron jobs or background tasks doing IO during the tests?
The following is the script I've used:

$ cat test.sh
#!/bin/sh
echo 3 > /proc/sys/vm/drop_caches
ionice -c 1 -n 0 dd if=bigfile1 of=/dev/null bs=1M 2>&1 | sed "s/\(.*\)/RT: \1/" &
cat /proc/$!/cgroup | sed "s/\(.*\)/RT: \1/"
ionice -c 2 -n 7 dd if=bigfile2 of=/dev/null bs=1M 2>&1 | sed "s/\(.*\)/BE: \1/" &
cat /proc/$!/cgroup | sed "s/\(.*\)/BE: \1/"
for i in 1 2; do
	wait
done

And the results on my PC:

2.6.30-rc4
~~~~~~~~~~
$ sudo sh test.sh | sort
BE: 234+0 records in
BE: 234+0 records out
BE: 245366784 bytes (245 MB) copied, 21.3406 s, 11.5 MB/s
RT: 234+0 records in
RT: 234+0 records out
RT: 245366784 bytes (245 MB) copied, 11.989 s, 20.5 MB/s

$ sudo sh test.sh | sort
BE: 234+0 records in
BE: 234+0 records out
BE: 245366784 bytes (245 MB) copied, 23.4436 s, 10.5 MB/s
RT: 234+0 records in
RT: 234+0 records out
RT: 245366784 bytes (245 MB) copied, 11.9555 s, 20.5 MB/s

$ sudo sh test.sh | sort
BE: 234+0 records in
BE: 234+0 records out
BE: 245366784 bytes (245 MB) copied, 21.622 s, 11.3 MB/s
RT: 234+0 records in
RT: 234+0 records out
RT: 245366784 bytes (245 MB) copied, 11.9856 s, 20.5 MB/s

$ sudo sh test.sh | sort
BE: 234+0 records in
BE: 234+0 records out
BE: 245366784 bytes (245 MB) copied, 21.5664 s, 11.4 MB/s
RT: 234+0 records in
RT: 234+0 records out
RT: 245366784 bytes (245 MB) copied, 11.8522 s, 20.7 MB/s

2.6.30-rc4 + io-throttle, no BW limit, both tasks in the root cgroup
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
$ sudo sh ./test.sh | sort
BE: 234+0 records in
BE: 234+0 records out
BE: 245366784 bytes (245 MB) copied, 23.6739 s, 10.4 MB/s
BE: cgroup 4:blockio:/
RT: 234+0 records in
RT: 234+0 records out
RT: 245366784 bytes (245 MB) copied, 12.2853 s, 20.0 MB/s
RT: 4:blockio:/

$ sudo sh ./test.sh | sort
BE: 234+0 records in
BE: 234+0 records out
BE: 245366784 bytes (245 MB) copied, 23.7483 s, 10.3 MB/s
BE: cgroup 4:blockio:/
RT: 234+0 records in
RT: 234+0 records out
RT: 245366784 bytes (245 MB) copied, 12.3597 s, 19.9 MB/s
RT: 4:blockio:/

$ sudo sh ./test.sh | sort
BE: 234+0 records in
BE: 234+0 records out
BE: 245366784 bytes (245 MB) copied, 23.6843 s, 10.4 MB/s
BE: cgroup 4:blockio:/
RT: 234+0 records in
RT: 234+0 records out
RT: 245366784 bytes (245 MB) copied, 12.4886 s, 19.6 MB/s
RT: 4:blockio:/

$ sudo sh ./test.sh | sort
BE: 234+0 records in
BE: 234+0 records out
BE: 245366784 bytes (245 MB) copied, 23.8621 s, 10.3 MB/s
BE: cgroup 4:blockio:/
RT: 234+0 records in
RT: 234+0 records out
RT: 245366784 bytes (245 MB) copied, 12.6737 s, 19.4 MB/s
RT: 4:blockio:/

The difference seems to be just the expected overhead.

-Andrea

^ permalink raw reply	[flat|nested] 97+ messages in thread
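[Editorial note: the per-reader slowdown in the root cgroup can be worked out from the copy times quoted above. This is plain arithmetic on the first run of each kernel; no new measurements are introduced:]

```python
# Copy times from the first run on each kernel (root cgroup, no limit):
be_plain, be_throttle = 21.3406, 23.6739   # BE prio-7 reader, seconds
rt_plain, rt_throttle = 11.9890, 12.2853   # RT prio-0 reader, seconds

# Relative slowdown of each reader with the io-throttle patches applied.
be_overhead = (be_throttle - be_plain) / be_plain * 100
rt_overhead = (rt_throttle - rt_plain) / rt_plain * 100
print(f"BE overhead: {be_overhead:.1f}%, RT overhead: {rt_overhead:.1f}%")
```

Run to run, the RT reader stays within a few percent while the BE reader varies more, consistent with the raw MB/s figures quoted above.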
* Re: IO scheduler based IO Controller V2 2009-05-06 21:52 ` Vivek Goyal ` (2 preceding siblings ...) (?) @ 2009-05-07 9:04 ` Andrea Righi 2009-05-07 12:22 ` Andrea Righi ` (3 more replies) -1 siblings, 4 replies; 97+ messages in thread From: Andrea Righi @ 2009-05-07 9:04 UTC (permalink / raw) To: Vivek Goyal Cc: Andrew Morton, nauman, dpshah, lizf, mikew, fchecconi, paolo.valente, jens.axboe, ryov, fernando, s-uchida, taka, guijianfeng, jmoyer, dhaval, balbir, linux-kernel, containers, agk, dm-devel, snitzer, m-ikeda, peterz On Wed, May 06, 2009 at 05:52:35PM -0400, Vivek Goyal wrote: > > > Without io-throttle patches > > > --------------------------- > > > - Two readers, first BE prio 7, second BE prio 0 > > > > > > 234179072 bytes (234 MB) copied, 4.12074 s, 56.8 MB/s > > > High prio reader finished > > > 234179072 bytes (234 MB) copied, 5.36023 s, 43.7 MB/s > > > > > > Note: There is no service differentiation between prio 0 and prio 7 task > > > with io-throttle patches. > > > > > > Test 3 > > > ====== > > > - Run the one RT reader and one BE reader in root cgroup without any > > > limitations. I guess this should mean unlimited BW and behavior should > > > be same as with CFQ without io-throttling patches. > > > > > > With io-throttle patches > > > ========================= > > > Ran the test 4 times because I was getting different results in different > > > runs. 
> > > > > > - Two readers, one RT prio 0 other BE prio 7 > > > > > > 234179072 bytes (234 MB) copied, 2.74604 s, 85.3 MB/s > > > 234179072 bytes (234 MB) copied, 5.20995 s, 44.9 MB/s > > > RT task finished > > > > > > 234179072 bytes (234 MB) copied, 4.54417 s, 51.5 MB/s > > > RT task finished > > > 234179072 bytes (234 MB) copied, 5.23396 s, 44.7 MB/s > > > > > > 234179072 bytes (234 MB) copied, 5.17727 s, 45.2 MB/s > > > RT task finished > > > 234179072 bytes (234 MB) copied, 5.25894 s, 44.5 MB/s > > > > > > 234179072 bytes (234 MB) copied, 2.74141 s, 85.4 MB/s > > > 234179072 bytes (234 MB) copied, 5.20536 s, 45.0 MB/s > > > RT task finished > > > > > > Note: Out of 4 runs, looks like twice it is complete priority inversion > > > and RT task finished after BE task. Rest of the two times, the > > > difference between BW of RT and BE task is much less as compared to > > > without patches. In fact once it was almost same. > > > > This is strange. If you don't set any limit there shouldn't be any > > difference respect to the other case (without io-throttle patches). > > > > At worst a small overhead given by the task_to_iothrottle(), under > > rcu_read_lock(). I'll repeat this test ASAP and see if I'll be able to > > reproduce this strange behaviour. > > Ya, I also found this strange. At least in root group there should not be > any behavior change (at max one might expect little drop in throughput > because of extra code). Hi Vivek, I'm not able to reproduce the strange behaviour above. Which commands are you running exactly? is the system isolated (stupid question) no cron or background tasks doing IO during the tests? 
Following the script I've used:

$ cat test.sh
#!/bin/sh
echo 3 > /proc/sys/vm/drop_caches
ionice -c 1 -n 0 dd if=bigfile1 of=/dev/null bs=1M 2>&1 | sed "s/\(.*\)/RT: \1/" &
cat /proc/$!/cgroup | sed "s/\(.*\)/RT: \1/"
ionice -c 2 -n 7 dd if=bigfile2 of=/dev/null bs=1M 2>&1 | sed "s/\(.*\)/BE: \1/" &
cat /proc/$!/cgroup | sed "s/\(.*\)/BE: \1/"
for i in 1 2; do
	wait
done

And the results on my PC:

2.6.30-rc4
~~~~~~~~~~
$ sudo sh test.sh | sort
BE: 234+0 records in
BE: 234+0 records out
BE: 245366784 bytes (245 MB) copied, 21.3406 s, 11.5 MB/s
RT: 234+0 records in
RT: 234+0 records out
RT: 245366784 bytes (245 MB) copied, 11.989 s, 20.5 MB/s
$ sudo sh test.sh | sort
BE: 234+0 records in
BE: 234+0 records out
BE: 245366784 bytes (245 MB) copied, 23.4436 s, 10.5 MB/s
RT: 234+0 records in
RT: 234+0 records out
RT: 245366784 bytes (245 MB) copied, 11.9555 s, 20.5 MB/s
$ sudo sh test.sh | sort
BE: 234+0 records in
BE: 234+0 records out
BE: 245366784 bytes (245 MB) copied, 21.622 s, 11.3 MB/s
RT: 234+0 records in
RT: 234+0 records out
RT: 245366784 bytes (245 MB) copied, 11.9856 s, 20.5 MB/s
$ sudo sh test.sh | sort
BE: 234+0 records in
BE: 234+0 records out
BE: 245366784 bytes (245 MB) copied, 21.5664 s, 11.4 MB/s
RT: 234+0 records in
RT: 234+0 records out
RT: 245366784 bytes (245 MB) copied, 11.8522 s, 20.7 MB/s

2.6.30-rc4 + io-throttle, no BW limit, both tasks in the root cgroup
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
$ sudo sh ./test.sh | sort
BE: 234+0 records in
BE: 234+0 records out
BE: 245366784 bytes (245 MB) copied, 23.6739 s, 10.4 MB/s
BE: cgroup 4:blockio:/
RT: 234+0 records in
RT: 234+0 records out
RT: 245366784 bytes (245 MB) copied, 12.2853 s, 20.0 MB/s
RT: 4:blockio:/
$ sudo sh ./test.sh | sort
BE: 234+0 records in
BE: 234+0 records out
BE: 245366784 bytes (245 MB) copied, 23.7483 s, 10.3 MB/s
BE: cgroup 4:blockio:/
RT: 234+0 records in
RT: 234+0 records out
RT: 245366784 bytes (245 MB) copied, 12.3597 s, 19.9 MB/s
RT: 4:blockio:/
$ sudo sh ./test.sh | sort
BE: 234+0 records in
BE: 234+0 records out
BE: 245366784 bytes (245 MB) copied, 23.6843 s, 10.4 MB/s
BE: cgroup 4:blockio:/
RT: 234+0 records in
RT: 234+0 records out
RT: 245366784 bytes (245 MB) copied, 12.4886 s, 19.6 MB/s
RT: 4:blockio:/
$ sudo sh ./test.sh | sort
BE: 234+0 records in
BE: 234+0 records out
BE: 245366784 bytes (245 MB) copied, 23.8621 s, 10.3 MB/s
BE: cgroup 4:blockio:/
RT: 234+0 records in
RT: 234+0 records out
RT: 245366784 bytes (245 MB) copied, 12.6737 s, 19.4 MB/s
RT: 4:blockio:/

The difference seems to be just the expected overhead.

-Andrea

^ permalink raw reply	[flat|nested] 97+ messages in thread
* Re: IO scheduler based IO Controller V2
  2009-05-07  9:04           ` Andrea Righi
@ 2009-05-07 12:22             ` Andrea Righi
  2009-05-07 12:22             ` Andrea Righi
                               ` (2 subsequent siblings)
  3 siblings, 0 replies; 97+ messages in thread
From: Andrea Righi @ 2009-05-07 12:22 UTC (permalink / raw)
  To: Vivek Goyal
  Cc: Andrew Morton, nauman, dpshah, lizf, mikew, fchecconi,
	paolo.valente, jens.axboe, ryov, fernando, s-uchida, taka,
	guijianfeng, jmoyer, dhaval, balbir, linux-kernel, containers,
	agk, dm-devel, snitzer, m-ikeda, peterz

On Thu, May 07, 2009 at 11:04:50AM +0200, Andrea Righi wrote:
> On Wed, May 06, 2009 at 05:52:35PM -0400, Vivek Goyal wrote:
> > > > Without io-throttle patches
> > > > ---------------------------
> > > > - Two readers, first BE prio 7, second BE prio 0
> > > >
> > > > 234179072 bytes (234 MB) copied, 4.12074 s, 56.8 MB/s
> > > > High prio reader finished
> > > > 234179072 bytes (234 MB) copied, 5.36023 s, 43.7 MB/s
> > > >
> > > > Note: There is no service differentiation between prio 0 and prio 7 task
> > > > with io-throttle patches.
> > > >
> > > > Test 3
> > > > ======
> > > > - Run the one RT reader and one BE reader in root cgroup without any
> > > >   limitations. I guess this should mean unlimited BW and behavior should
> > > >   be same as with CFQ without io-throttling patches.
> > > >
> > > > With io-throttle patches
> > > > =========================
> > > > Ran the test 4 times because I was getting different results in different
> > > > runs.
> > > >
> > > > - Two readers, one RT prio 0 other BE prio 7
> > > >
> > > > 234179072 bytes (234 MB) copied, 2.74604 s, 85.3 MB/s
> > > > 234179072 bytes (234 MB) copied, 5.20995 s, 44.9 MB/s
> > > > RT task finished
> > > >
> > > > 234179072 bytes (234 MB) copied, 4.54417 s, 51.5 MB/s
> > > > RT task finished
> > > > 234179072 bytes (234 MB) copied, 5.23396 s, 44.7 MB/s
> > > >
> > > > 234179072 bytes (234 MB) copied, 5.17727 s, 45.2 MB/s
> > > > RT task finished
> > > > 234179072 bytes (234 MB) copied, 5.25894 s, 44.5 MB/s
> > > >
> > > > 234179072 bytes (234 MB) copied, 2.74141 s, 85.4 MB/s
> > > > 234179072 bytes (234 MB) copied, 5.20536 s, 45.0 MB/s
> > > > RT task finished
> > > >
> > > > Note: Out of 4 runs, looks like twice it is complete priority inversion
> > > > and RT task finished after BE task. Rest of the two times, the
> > > > difference between BW of RT and BE task is much less as compared to
> > > > without patches. In fact once it was almost same.
> > >
> > > This is strange. If you don't set any limit there shouldn't be any
> > > difference respect to the other case (without io-throttle patches).
> > >
> > > At worst a small overhead given by the task_to_iothrottle(), under
> > > rcu_read_lock(). I'll repeat this test ASAP and see if I'll be able to
> > > reproduce this strange behaviour.
> >
> > Ya, I also found this strange. At least in root group there should not be
> > any behavior change (at max one might expect little drop in throughput
> > because of extra code).
>
> Hi Vivek,
>
> I'm not able to reproduce the strange behaviour above.
>
> Which commands are you running exactly? is the system isolated (stupid
> question) no cron or background tasks doing IO during the tests?
>
> Following the script I've used:
>
> $ cat test.sh
> #!/bin/sh
> echo 3 > /proc/sys/vm/drop_caches
> ionice -c 1 -n 0 dd if=bigfile1 of=/dev/null bs=1M 2>&1 | sed "s/\(.*\)/RT: \1/" &
> cat /proc/$!/cgroup | sed "s/\(.*\)/RT: \1/"
> ionice -c 2 -n 7 dd if=bigfile2 of=/dev/null bs=1M 2>&1 | sed "s/\(.*\)/BE: \1/" &
> cat /proc/$!/cgroup | sed "s/\(.*\)/BE: \1/"
> for i in 1 2; do
> 	wait
> done
>
> And the results on my PC:
>
> 2.6.30-rc4
> ~~~~~~~~~~
> $ sudo sh test.sh | sort
> BE: 234+0 records in
> BE: 234+0 records out
> BE: 245366784 bytes (245 MB) copied, 21.3406 s, 11.5 MB/s
> RT: 234+0 records in
> RT: 234+0 records out
> RT: 245366784 bytes (245 MB) copied, 11.989 s, 20.5 MB/s
> $ sudo sh test.sh | sort
> BE: 234+0 records in
> BE: 234+0 records out
> BE: 245366784 bytes (245 MB) copied, 23.4436 s, 10.5 MB/s
> RT: 234+0 records in
> RT: 234+0 records out
> RT: 245366784 bytes (245 MB) copied, 11.9555 s, 20.5 MB/s
> $ sudo sh test.sh | sort
> BE: 234+0 records in
> BE: 234+0 records out
> BE: 245366784 bytes (245 MB) copied, 21.622 s, 11.3 MB/s
> RT: 234+0 records in
> RT: 234+0 records out
> RT: 245366784 bytes (245 MB) copied, 11.9856 s, 20.5 MB/s
> $ sudo sh test.sh | sort
> BE: 234+0 records in
> BE: 234+0 records out
> BE: 245366784 bytes (245 MB) copied, 21.5664 s, 11.4 MB/s
> RT: 234+0 records in
> RT: 234+0 records out
> RT: 245366784 bytes (245 MB) copied, 11.8522 s, 20.7 MB/s
>
> 2.6.30-rc4 + io-throttle, no BW limit, both tasks in the root cgroup
> ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
> $ sudo sh ./test.sh | sort
> BE: 234+0 records in
> BE: 234+0 records out
> BE: 245366784 bytes (245 MB) copied, 23.6739 s, 10.4 MB/s
> BE: cgroup 4:blockio:/
> RT: 234+0 records in
> RT: 234+0 records out
> RT: 245366784 bytes (245 MB) copied, 12.2853 s, 20.0 MB/s
> RT: 4:blockio:/
> $ sudo sh ./test.sh | sort
> BE: 234+0 records in
> BE: 234+0 records out
> BE: 245366784 bytes (245 MB) copied, 23.7483 s, 10.3 MB/s
> BE: cgroup 4:blockio:/
> RT: 234+0 records in
> RT: 234+0 records out
> RT: 245366784 bytes (245 MB) copied, 12.3597 s, 19.9 MB/s
> RT: 4:blockio:/
> $ sudo sh ./test.sh | sort
> BE: 234+0 records in
> BE: 234+0 records out
> BE: 245366784 bytes (245 MB) copied, 23.6843 s, 10.4 MB/s
> BE: cgroup 4:blockio:/
> RT: 234+0 records in
> RT: 234+0 records out
> RT: 245366784 bytes (245 MB) copied, 12.4886 s, 19.6 MB/s
> RT: 4:blockio:/
> $ sudo sh ./test.sh | sort
> BE: 234+0 records in
> BE: 234+0 records out
> BE: 245366784 bytes (245 MB) copied, 23.8621 s, 10.3 MB/s
> BE: cgroup 4:blockio:/
> RT: 234+0 records in
> RT: 234+0 records out
> RT: 245366784 bytes (245 MB) copied, 12.6737 s, 19.4 MB/s
> RT: 4:blockio:/
>
> The difference seems to be just the expected overhead.

BTW, it is possible to reduce the io-throttle overhead even more for
non io-throttle users (also when CONFIG_CGROUP_IO_THROTTLE is enabled)
using the trick below.

2.6.30-rc4 + io-throttle + following patch, no BW limit, tasks in root cgroup
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
$ sudo sh test.sh | sort
BE: 234+0 records in
BE: 234+0 records out
BE: 245366784 bytes (245 MB) copied, 17.462 s, 14.1 MB/s
BE: 4:blockio:/
RT: 234+0 records in
RT: 234+0 records out
RT: 245366784 bytes (245 MB) copied, 11.7865 s, 20.8 MB/s
RT: 4:blockio:/
$ sudo sh test.sh | sort
BE: 234+0 records in
BE: 234+0 records out
BE: 245366784 bytes (245 MB) copied, 18.8375 s, 13.0 MB/s
BE: 4:blockio:/
RT: 234+0 records in
RT: 234+0 records out
RT: 245366784 bytes (245 MB) copied, 11.9148 s, 20.6 MB/s
RT: 4:blockio:/
$ sudo sh test.sh | sort
BE: 234+0 records in
BE: 234+0 records out
BE: 245366784 bytes (245 MB) copied, 19.6826 s, 12.5 MB/s
BE: 4:blockio:/
RT: 234+0 records in
RT: 234+0 records out
RT: 245366784 bytes (245 MB) copied, 11.8715 s, 20.7 MB/s
RT: 4:blockio:/
$ sudo sh test.sh | sort
BE: 234+0 records in
BE: 234+0 records out
BE: 245366784 bytes (245 MB) copied, 18.9152 s, 13.0 MB/s
BE: 4:blockio:/
RT: 234+0 records in
RT: 234+0 records out
RT: 245366784 bytes (245 MB) copied, 11.8925 s, 20.6 MB/s
RT: 4:blockio:/

[ To be applied on top of io-throttle v16 ]

Signed-off-by: Andrea Righi <righi.andrea@gmail.com>
---
 block/blk-io-throttle.c |   16 ++++++++++++++--
 1 files changed, 14 insertions(+), 2 deletions(-)

diff --git a/block/blk-io-throttle.c b/block/blk-io-throttle.c
index e2dfd24..8b45c71 100644
--- a/block/blk-io-throttle.c
+++ b/block/blk-io-throttle.c
@@ -131,6 +131,14 @@ struct iothrottle_node {
 	struct iothrottle_stat stat;
 };

+/*
+ * This is a trick to reduce the unneeded overhead when io-throttle is not used
+ * at all. We use a counter of the io-throttle rules; if the counter is zero,
+ * we immediately return from the io-throttle hooks, without accounting IO and
+ * without checking if we need to apply some limiting rules.
+ */
+static atomic_t iothrottle_node_count __read_mostly;
+
 /**
  * struct iothrottle - throttling rules for a cgroup
  * @css: pointer to the cgroup state
@@ -193,6 +201,7 @@ static void iothrottle_insert_node(struct iothrottle *iot,
 {
 	WARN_ON_ONCE(!cgroup_is_locked());
 	list_add_rcu(&n->node, &iot->list);
+	atomic_inc(&iothrottle_node_count);
 }

 /*
@@ -214,6 +223,7 @@
 iothrottle_delete_node(struct iothrottle *iot, struct iothrottle_node *n)
 {
 	WARN_ON_ONCE(!cgroup_is_locked());
 	list_del_rcu(&n->node);
+	atomic_dec(&iothrottle_node_count);
 }

 /*
@@ -250,8 +260,10 @@ static void iothrottle_destroy(struct cgroup_subsys *ss, struct cgroup *cgrp)
 	 * reference to the list.
 	 */
 	if (!list_empty(&iot->list))
-		list_for_each_entry_safe(n, p, &iot->list, node)
+		list_for_each_entry_safe(n, p, &iot->list, node) {
 			kfree(n);
+			atomic_dec(&iothrottle_node_count);
+		}
 	kfree(iot);
 }

@@ -836,7 +848,7 @@ cgroup_io_throttle(struct bio *bio, struct block_device *bdev, ssize_t bytes)
 	unsigned long long sleep;
 	int type, can_sleep = 1;

-	if (iothrottle_disabled())
+	if (iothrottle_disabled() || !atomic_read(&iothrottle_node_count))
 		return 0;
 	if (unlikely(!bdev))
 		return 0;

^ permalink raw reply related	[flat|nested] 97+ messages in thread
* Re: IO scheduler based IO Controller V2
  2009-05-07  9:04           ` Andrea Righi
  2009-05-07 12:22             ` Andrea Righi
  2009-05-07 12:22             ` Andrea Righi
@ 2009-05-07 14:11             ` Vivek Goyal
  2009-05-07 14:11             ` Vivek Goyal
  3 siblings, 0 replies; 97+ messages in thread
From: Vivek Goyal @ 2009-05-07 14:11 UTC (permalink / raw)
  To: Andrea Righi
  Cc: dhaval-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
	snitzer-H+wXaHxf7aLQT0dZR+AlfA, dm-devel-H+wXaHxf7aLQT0dZR+AlfA,
	jens.axboe-QHcLZuEGTsvQT0dZR+AlfA, agk-H+wXaHxf7aLQT0dZR+AlfA,
	balbir-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8,
	paolo.valente-rcYM44yAMweonA0d6jMUrA,
	fernando-gVGce1chcLdL9jVzuh4AOg, jmoyer-H+wXaHxf7aLQT0dZR+AlfA,
	fchecconi-Re5JQEeQqe8AvxtiuMwx3w,
	containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA,
	linux-kernel-u79uwXL29TY76Z2rM5mHXA, Andrew Morton

On Thu, May 07, 2009 at 11:04:50AM +0200, Andrea Righi wrote:
> On Wed, May 06, 2009 at 05:52:35PM -0400, Vivek Goyal wrote:
> > > > Without io-throttle patches
> > > > ---------------------------
> > > > - Two readers, first BE prio 7, second BE prio 0
> > > >
> > > > 234179072 bytes (234 MB) copied, 4.12074 s, 56.8 MB/s
> > > > High prio reader finished
> > > > 234179072 bytes (234 MB) copied, 5.36023 s, 43.7 MB/s
> > > >
> > > > Note: There is no service differentiation between prio 0 and prio 7 task
> > > > with io-throttle patches.
> > > >
> > > > Test 3
> > > > ======
> > > > - Run the one RT reader and one BE reader in root cgroup without any
> > > >   limitations. I guess this should mean unlimited BW and behavior should
> > > >   be same as with CFQ without io-throttling patches.
> > > >
> > > > With io-throttle patches
> > > > =========================
> > > > Ran the test 4 times because I was getting different results in different
> > > > runs.
> > > >
> > > > - Two readers, one RT prio 0 other BE prio 7
> > > >
> > > > 234179072 bytes (234 MB) copied, 2.74604 s, 85.3 MB/s
> > > > 234179072 bytes (234 MB) copied, 5.20995 s, 44.9 MB/s
> > > > RT task finished
> > > >
> > > > 234179072 bytes (234 MB) copied, 4.54417 s, 51.5 MB/s
> > > > RT task finished
> > > > 234179072 bytes (234 MB) copied, 5.23396 s, 44.7 MB/s
> > > >
> > > > 234179072 bytes (234 MB) copied, 5.17727 s, 45.2 MB/s
> > > > RT task finished
> > > > 234179072 bytes (234 MB) copied, 5.25894 s, 44.5 MB/s
> > > >
> > > > 234179072 bytes (234 MB) copied, 2.74141 s, 85.4 MB/s
> > > > 234179072 bytes (234 MB) copied, 5.20536 s, 45.0 MB/s
> > > > RT task finished
> > > >
> > > > Note: Out of 4 runs, looks like twice it is complete priority inversion
> > > > and RT task finished after BE task. Rest of the two times, the
> > > > difference between BW of RT and BE task is much less as compared to
> > > > without patches. In fact once it was almost same.
> > >
> > > This is strange. If you don't set any limit there shouldn't be any
> > > difference respect to the other case (without io-throttle patches).
> > >
> > > At worst a small overhead given by the task_to_iothrottle(), under
> > > rcu_read_lock(). I'll repeat this test ASAP and see if I'll be able to
> > > reproduce this strange behaviour.
> >
> > Ya, I also found this strange. At least in root group there should not be
> > any behavior change (at max one might expect little drop in throughput
> > because of extra code).
>
> Hi Vivek,
>
> I'm not able to reproduce the strange behaviour above.
>
> Which commands are you running exactly? is the system isolated (stupid
> question) no cron or background tasks doing IO during the tests?
>
> Following the script I've used:
>
> $ cat test.sh
> #!/bin/sh
> echo 3 > /proc/sys/vm/drop_caches
> ionice -c 1 -n 0 dd if=bigfile1 of=/dev/null bs=1M 2>&1 | sed "s/\(.*\)/RT: \1/" &
> cat /proc/$!/cgroup | sed "s/\(.*\)/RT: \1/"
> ionice -c 2 -n 7 dd if=bigfile2 of=/dev/null bs=1M 2>&1 | sed "s/\(.*\)/BE: \1/" &
> cat /proc/$!/cgroup | sed "s/\(.*\)/BE: \1/"
> for i in 1 2; do
> 	wait
> done
>
> And the results on my PC:
> [..]
> The difference seems to be just the expected overhead.

Hm..., something is really amiss here. I took your scripts and ran them on
my system and I still see the issue. There is nothing else running on the
system and it is isolated.

2.6.30-rc4 + io-throttle patches V16
====================================
This is a freshly booted system with nothing extra running on it. It is a
4 core system.

Disk1
=====
This is a fast disk which supports a queue depth of 31. Following is the
output picked from dmesg for my device properties.

[    3.016099] sd 2:0:0:0: [sdb] 488397168 512-byte hardware sectors: (250 GB/232 GiB)
[    3.016188] sd 2:0:0:0: Attached scsi generic sg2 type 0

Following are the results of 4 runs of your script. (I just changed the
script to read the right file on my system, if=/mnt/sdb/zerofile1.)

[root@chilli io-throttle-tests]# ./andrea-test-script.sh
BE: 223+1 records in
BE: 223+1 records out
BE: 234179072 bytes (234 MB) copied, 4.38435 s, 53.4 MB/s
RT: 223+1 records in
RT: 223+1 records out
RT: 234179072 bytes (234 MB) copied, 5.20706 s, 45.0 MB/s
[root@chilli io-throttle-tests]# ./andrea-test-script.sh
BE: 223+1 records in
BE: 223+1 records out
BE: 234179072 bytes (234 MB) copied, 5.12953 s, 45.7 MB/s
RT: 223+1 records in
RT: 223+1 records out
RT: 234179072 bytes (234 MB) copied, 5.23573 s, 44.7 MB/s
[root@chilli io-throttle-tests]# ./andrea-test-script.sh
BE: 223+1 records in
BE: 223+1 records out
BE: 234179072 bytes (234 MB) copied, 3.54644 s, 66.0 MB/s
RT: 223+1 records in
RT: 223+1 records out
RT: 234179072 bytes (234 MB) copied, 5.19406 s, 45.1 MB/s
[root@chilli io-throttle-tests]# ./andrea-test-script.sh
RT: 223+1 records in
RT: 223+1 records out
RT: 234179072 bytes (234 MB) copied, 5.21908 s, 44.9 MB/s
BE: 223+1 records in
BE: 223+1 records out
BE: 234179072 bytes (234 MB) copied, 5.23802 s, 44.7 MB/s

Disk2
=====
This is a relatively slower disk with no command queuing.

[root@chilli io-throttle-tests]# ./andrea-test-script.sh
RT: 223+1 records in
RT: 223+1 records out
RT: 234179072 bytes (234 MB) copied, 7.06471 s, 33.1 MB/s
BE: 223+1 records in
BE: 223+1 records out
BE: 234179072 bytes (234 MB) copied, 8.01571 s, 29.2 MB/s
[root@chilli io-throttle-tests]# ./andrea-test-script.sh
RT: 223+1 records in
RT: 223+1 records out
RT: 234179072 bytes (234 MB) copied, 7.89043 s, 29.7 MB/s
BE: 223+1 records in
BE: 223+1 records out
BE: 234179072 bytes (234 MB) copied, 8.03428 s, 29.1 MB/s
[root@chilli io-throttle-tests]# ./andrea-test-script.sh
BE: 223+1 records in
BE: 223+1 records out
BE: 234179072 bytes (234 MB) copied, 7.38942 s, 31.7 MB/s
RT: 223+1 records in
RT: 223+1 records out
RT: 234179072 bytes (234 MB) copied, 8.01146 s, 29.2 MB/s
[root@chilli io-throttle-tests]# ./andrea-test-script.sh
BE: 223+1 records in
BE: 223+1 records out
BE: 234179072 bytes (234 MB) copied, 7.78351 s, 30.1 MB/s
RT: 223+1 records in
RT: 223+1 records out
RT: 234179072 bytes (234 MB) copied, 8.06292 s, 29.0 MB/s

Disk3
=====
This is an Intel SSD.

[root@chilli io-throttle-tests]# ./andrea-test-script.sh
RT: 223+1 records in
RT: 223+1 records out
RT: 234179072 bytes (234 MB) copied, 0.993735 s, 236 MB/s
BE: 223+1 records in
BE: 223+1 records out
BE: 234179072 bytes (234 MB) copied, 1.98772 s, 118 MB/s
[root@chilli io-throttle-tests]# ./andrea-test-script.sh
RT: 223+1 records in
RT: 223+1 records out
RT: 234179072 bytes (234 MB) copied, 1.8616 s, 126 MB/s
BE: 223+1 records in
BE: 223+1 records out
BE: 234179072 bytes (234 MB) copied, 1.98499 s, 118 MB/s
[root@chilli io-throttle-tests]# ./andrea-test-script.sh
RT: 223+1 records in
RT: 223+1 records out
RT: 234179072 bytes (234 MB) copied, 1.01174 s, 231 MB/s
BE: 223+1 records in
BE: 223+1 records out
BE: 234179072 bytes (234 MB) copied, 1.99143 s, 118 MB/s
[root@chilli io-throttle-tests]# ./andrea-test-script.sh
RT: 223+1 records in
RT: 223+1 records out
RT: 234179072 bytes (234 MB) copied, 1.96132 s, 119 MB/s
BE: 223+1 records in
BE: 223+1 records out
BE: 234179072 bytes (234 MB) copied, 1.97746 s, 118 MB/s

Results without io-throttle patches (vanilla 2.6.30-rc4)
========================================================

Disk 1
======
This is a relatively faster SATA drive with command queuing enabled.

RT: 223+1 records in
RT: 223+1 records out
RT: 234179072 bytes (234 MB) copied, 2.84065 s, 82.4 MB/s
BE: 223+1 records in
BE: 223+1 records out
BE: 234179072 bytes (234 MB) copied, 5.30087 s, 44.2 MB/s
[root@chilli io-throttle-tests]# ./andrea-test-script.sh
RT: 223+1 records in
RT: 223+1 records out
RT: 234179072 bytes (234 MB) copied, 2.69688 s, 86.8 MB/s
BE: 223+1 records in
BE: 223+1 records out
BE: 234179072 bytes (234 MB) copied, 5.18175 s, 45.2 MB/s
[root@chilli io-throttle-tests]# ./andrea-test-script.sh
RT: 223+1 records in
RT: 223+1 records out
RT: 234179072 bytes (234 MB) copied, 2.73279 s, 85.7 MB/s
BE: 223+1 records in
BE: 223+1 records out
BE: 234179072 bytes (234 MB) copied, 5.21803 s, 44.9 MB/s
[root@chilli io-throttle-tests]# ./andrea-test-script.sh
RT: 223+1 records in
RT: 223+1 records out
RT: 234179072 bytes (234 MB) copied, 2.69304 s, 87.0 MB/s
BE: 223+1 records in
BE: 223+1 records out
BE: 234179072 bytes (234 MB) copied, 5.17821 s, 45.2 MB/s

Disk 2
======
Slower disk with no command queuing.

[root@chilli io-throttle-tests]# ./andrea-test-script.sh
RT: 223+1 records in
RT: 223+1 records out
RT: 234179072 bytes (234 MB) copied, 4.29453 s, 54.5 MB/s
BE: 223+1 records in
BE: 223+1 records out
BE: 234179072 bytes (234 MB) copied, 8.04978 s, 29.1 MB/s
[root@chilli io-throttle-tests]# ./andrea-test-script.sh
RT: 223+1 records in
RT: 223+1 records out
RT: 234179072 bytes (234 MB) copied, 3.96924 s, 59.0 MB/s
BE: 223+1 records in
BE: 223+1 records out
BE: 234179072 bytes (234 MB) copied, 7.74984 s, 30.2 MB/s
[root@chilli io-throttle-tests]# ./andrea-test-script.sh
RT: 223+1 records in
RT: 223+1 records out
RT: 234179072 bytes (234 MB) copied, 4.11254 s, 56.9 MB/s
BE: 223+1 records in
BE: 223+1 records out
BE: 234179072 bytes (234 MB) copied, 7.8678 s, 29.8 MB/s
[root@chilli io-throttle-tests]# ./andrea-test-script.sh
RT: 223+1 records in
RT: 223+1 records out
RT: 234179072 bytes (234 MB) copied, 3.95979 s, 59.1 MB/s
BE: 223+1 records in
BE: 223+1 records out
BE: 234179072 bytes (234 MB) copied, 7.73976 s, 30.3 MB/s

Disk3
=====
Intel SSD

[root@chilli io-throttle-tests]# ./andrea-test-script.sh
RT: 223+1 records in
RT: 223+1 records out
RT: 234179072 bytes (234 MB) copied, 0.996762 s, 235 MB/s
BE: 223+1 records in
BE: 223+1 records out
BE: 234179072 bytes (234 MB) copied, 1.93268 s, 121 MB/s
[root@chilli io-throttle-tests]# ./andrea-test-script.sh
RT: 223+1 records in
RT: 223+1 records out
RT: 234179072 bytes (234 MB) copied, 0.98511 s, 238 MB/s
BE: 223+1 records in
BE: 223+1 records out
BE: 234179072 bytes (234 MB) copied, 1.92481 s, 122 MB/s
[root@chilli io-throttle-tests]# ./andrea-test-script.sh
RT: 223+1 records in
RT: 223+1 records out
RT: 234179072 bytes (234 MB) copied, 0.986981 s, 237 MB/s
BE: 223+1 records in
BE: 223+1 records out
BE: 234179072 bytes (234 MB) copied, 1.9312 s, 121 MB/s
[root@chilli io-throttle-tests]# ./andrea-test-script.sh
RT: 223+1 records in
RT: 223+1 records out
RT: 234179072 bytes (234 MB) copied, 0.988448 s, 237 MB/s
BE: 223+1 records in
BE: 223+1 records out
BE: 234179072 bytes (234 MB) copied, 1.93885 s, 121 MB/s

So I am still seeing the issue with different kinds of disks also. At this
point I am really not sure why I am seeing such results.

I have the following patches applied on 30-rc4 (V16):

3954-vivek.goyal2008-res_counter-introduce-ratelimiting-attributes.patch
3955-vivek.goyal2008-page_cgroup-provide-a-generic-page-tracking-infrastructure.patch
3956-vivek.goyal2008-io-throttle-controller-infrastructure.patch
3957-vivek.goyal2008-kiothrottled-throttle-buffered-io.patch
3958-vivek.goyal2008-io-throttle-instrumentation.patch
3959-vivek.goyal2008-io-throttle-export-per-task-statistics-to-userspace.patch

Thanks
Vivek

^ permalink raw reply	[flat|nested] 97+ messages in thread
* Re: IO scheduler based IO Controller V2 2009-05-07 9:04 ` Andrea Righi ` (2 preceding siblings ...) 2009-05-07 14:11 ` Vivek Goyal @ 2009-05-07 14:11 ` Vivek Goyal [not found] ` <20090507141126.GA9463-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 3 siblings, 1 reply; 97+ messages in thread From: Vivek Goyal @ 2009-05-07 14:11 UTC (permalink / raw) To: Andrea Righi Cc: Andrew Morton, nauman, dpshah, lizf, mikew, fchecconi, paolo.valente, jens.axboe, ryov, fernando, s-uchida, taka, guijianfeng, jmoyer, dhaval, balbir, linux-kernel, containers, agk, dm-devel, snitzer, m-ikeda, peterz On Thu, May 07, 2009 at 11:04:50AM +0200, Andrea Righi wrote: > On Wed, May 06, 2009 at 05:52:35PM -0400, Vivek Goyal wrote: > > > > Without io-throttle patches > > > > --------------------------- > > > > - Two readers, first BE prio 7, second BE prio 0 > > > > > > > > 234179072 bytes (234 MB) copied, 4.12074 s, 56.8 MB/s > > > > High prio reader finished > > > > 234179072 bytes (234 MB) copied, 5.36023 s, 43.7 MB/s > > > > > > > > Note: There is no service differentiation between prio 0 and prio 7 task > > > > with io-throttle patches. > > > > > > > > Test 3 > > > > ====== > > > > - Run the one RT reader and one BE reader in root cgroup without any > > > > limitations. I guess this should mean unlimited BW and behavior should > > > > be same as with CFQ without io-throttling patches. > > > > > > > > With io-throttle patches > > > > ========================= > > > > Ran the test 4 times because I was getting different results in different > > > > runs. 
> > > > > > > > - Two readers, one RT prio 0 other BE prio 7 > > > > > > > > 234179072 bytes (234 MB) copied, 2.74604 s, 85.3 MB/s > > > > 234179072 bytes (234 MB) copied, 5.20995 s, 44.9 MB/s > > > > RT task finished > > > > > > > > 234179072 bytes (234 MB) copied, 4.54417 s, 51.5 MB/s > > > > RT task finished > > > > 234179072 bytes (234 MB) copied, 5.23396 s, 44.7 MB/s > > > > > > > > 234179072 bytes (234 MB) copied, 5.17727 s, 45.2 MB/s > > > > RT task finished > > > > 234179072 bytes (234 MB) copied, 5.25894 s, 44.5 MB/s > > > > > > > > 234179072 bytes (234 MB) copied, 2.74141 s, 85.4 MB/s > > > > 234179072 bytes (234 MB) copied, 5.20536 s, 45.0 MB/s > > > > RT task finished > > > > > > > > Note: Out of 4 runs, looks like twice it is complete priority inversion > > > > and RT task finished after BE task. Rest of the two times, the > > > > difference between BW of RT and BE task is much less as compared to > > > > without patches. In fact once it was almost same. > > > > > > This is strange. If you don't set any limit there shouldn't be any > > > difference respect to the other case (without io-throttle patches). > > > > > > At worst a small overhead given by the task_to_iothrottle(), under > > > rcu_read_lock(). I'll repeat this test ASAP and see if I'll be able to > > > reproduce this strange behaviour. > > > > Ya, I also found this strange. At least in root group there should not be > > any behavior change (at max one might expect little drop in throughput > > because of extra code). > > Hi Vivek, > > I'm not able to reproduce the strange behaviour above. > > Which commands are you running exactly? is the system isolated (stupid > question) no cron or background tasks doing IO during the tests? 
>
> Following is the script I've used:
>
> $ cat test.sh
> #!/bin/sh
> echo 3 > /proc/sys/vm/drop_caches
> ionice -c 1 -n 0 dd if=bigfile1 of=/dev/null bs=1M 2>&1 | sed "s/\(.*\)/RT: \1/" &
> cat /proc/$!/cgroup | sed "s/\(.*\)/RT: \1/"
> ionice -c 2 -n 7 dd if=bigfile2 of=/dev/null bs=1M 2>&1 | sed "s/\(.*\)/BE: \1/" &
> cat /proc/$!/cgroup | sed "s/\(.*\)/BE: \1/"
> for i in 1 2; do
> wait
> done
>
> And the results on my PC:
> [..]
> The difference seems to be just the expected overhead.

Hm..., something is really amiss here. I took your script and ran it on my system, and I still see the issue. There is nothing else running on the system and it is isolated.

2.6.30-rc4 + io-throttle patches V16
====================================

It is a freshly booted system with nothing extra running on it. This is a 4 core system.

Disk1
=====
This is a fast disk which supports a queue depth of 31. Following is the output picked from dmesg for my device properties.

[ 3.016099] sd 2:0:0:0: [sdb] 488397168 512-byte hardware sectors: (250 GB/232 GiB)
[ 3.016188] sd 2:0:0:0: Attached scsi generic sg2 type 0

Following are the results of 4 runs of your script. (I just changed the script to read the right file on my system, if=/mnt/sdb/zerofile1.)
[root@chilli io-throttle-tests]# ./andrea-test-script.sh BE: 223+1 records in BE: 223+1 records out BE: 234179072 bytes (234 MB) copied, 4.38435 s, 53.4 MB/s RT: 223+1 records in RT: 223+1 records out RT: 234179072 bytes (234 MB) copied, 5.20706 s, 45.0 MB/s [root@chilli io-throttle-tests]# ./andrea-test-script.sh BE: 223+1 records in BE: 223+1 records out BE: 234179072 bytes (234 MB) copied, 5.12953 s, 45.7 MB/s RT: 223+1 records in RT: 223+1 records out RT: 234179072 bytes (234 MB) copied, 5.23573 s, 44.7 MB/s [root@chilli io-throttle-tests]# ./andrea-test-script.sh BE: 223+1 records in BE: 223+1 records out BE: 234179072 bytes (234 MB) copied, 3.54644 s, 66.0 MB/s RT: 223+1 records in RT: 223+1 records out RT: 234179072 bytes (234 MB) copied, 5.19406 s, 45.1 MB/s [root@chilli io-throttle-tests]# ./andrea-test-script.sh RT: 223+1 records in RT: 223+1 records out RT: 234179072 bytes (234 MB) copied, 5.21908 s, 44.9 MB/s BE: 223+1 records in BE: 223+1 records out BE: 234179072 bytes (234 MB) copied, 5.23802 s, 44.7 MB/s Disk2 ===== This is a relatively slower disk with no command queuing. 
[root@chilli io-throttle-tests]# ./andrea-test-script.sh RT: 223+1 records in RT: 223+1 records out RT: 234179072 bytes (234 MB) copied, 7.06471 s, 33.1 MB/s BE: 223+1 records in BE: 223+1 records out BE: 234179072 bytes (234 MB) copied, 8.01571 s, 29.2 MB/s [root@chilli io-throttle-tests]# ./andrea-test-script.sh RT: 223+1 records in RT: 223+1 records out RT: 234179072 bytes (234 MB) copied, 7.89043 s, 29.7 MB/s BE: 223+1 records in BE: 223+1 records out BE: 234179072 bytes (234 MB) copied, 8.03428 s, 29.1 MB/s [root@chilli io-throttle-tests]# ./andrea-test-script.sh BE: 223+1 records in BE: 223+1 records out BE: 234179072 bytes (234 MB) copied, 7.38942 s, 31.7 MB/s RT: 223+1 records in RT: 223+1 records out RT: 234179072 bytes (234 MB) copied, 8.01146 s, 29.2 MB/s [root@chilli io-throttle-tests]# ./andrea-test-script.sh BE: 223+1 records in BE: 223+1 records out BE: 234179072 bytes (234 MB) copied, 7.78351 s, 30.1 MB/s RT: 223+1 records in RT: 223+1 records out RT: 234179072 bytes (234 MB) copied, 8.06292 s, 29.0 MB/s Disk3 ===== This is an Intel SSD. 
[root@chilli io-throttle-tests]# ./andrea-test-script.sh RT: 223+1 records in RT: 223+1 records out RT: 234179072 bytes (234 MB) copied, 0.993735 s, 236 MB/s BE: 223+1 records in BE: 223+1 records out BE: 234179072 bytes (234 MB) copied, 1.98772 s, 118 MB/s [root@chilli io-throttle-tests]# ./andrea-test-script.sh RT: 223+1 records in RT: 223+1 records out RT: 234179072 bytes (234 MB) copied, 1.8616 s, 126 MB/s BE: 223+1 records in BE: 223+1 records out BE: 234179072 bytes (234 MB) copied, 1.98499 s, 118 MB/s [root@chilli io-throttle-tests]# ./andrea-test-script.sh RT: 223+1 records in RT: 223+1 records out RT: 234179072 bytes (234 MB) copied, 1.01174 s, 231 MB/s BE: 223+1 records in BE: 223+1 records out BE: 234179072 bytes (234 MB) copied, 1.99143 s, 118 MB/s [root@chilli io-throttle-tests]# ./andrea-test-script.sh RT: 223+1 records in RT: 223+1 records out RT: 234179072 bytes (234 MB) copied, 1.96132 s, 119 MB/s BE: 223+1 records in BE: 223+1 records out BE: 234179072 bytes (234 MB) copied, 1.97746 s, 118 MB/s Results without io-throttle patches (vanilla 2.6.30-rc4) ======================================================== Disk 1 ====== This is relatively faster SATA drive with command queuing enabled. 
RT: 223+1 records in RT: 223+1 records out RT: 234179072 bytes (234 MB) copied, 2.84065 s, 82.4 MB/s BE: 223+1 records in BE: 223+1 records out BE: 234179072 bytes (234 MB) copied, 5.30087 s, 44.2 MB/s [root@chilli io-throttle-tests]# ./andrea-test-script.sh RT: 223+1 records in RT: 223+1 records out RT: 234179072 bytes (234 MB) copied, 2.69688 s, 86.8 MB/s BE: 223+1 records in BE: 223+1 records out BE: 234179072 bytes (234 MB) copied, 5.18175 s, 45.2 MB/s [root@chilli io-throttle-tests]# ./andrea-test-script.sh RT: 223+1 records in RT: 223+1 records out RT: 234179072 bytes (234 MB) copied, 2.73279 s, 85.7 MB/s BE: 223+1 records in BE: 223+1 records out BE: 234179072 bytes (234 MB) copied, 5.21803 s, 44.9 MB/s [root@chilli io-throttle-tests]# ./andrea-test-script.sh RT: 223+1 records in RT: 223+1 records out RT: 234179072 bytes (234 MB) copied, 2.69304 s, 87.0 MB/s BE: 223+1 records in BE: 223+1 records out BE: 234179072 bytes (234 MB) copied, 5.17821 s, 45.2 MB/s Disk 2 ====== Slower disk with no command queuing. 
[root@chilli io-throttle-tests]# ./andrea-test-script.sh RT: 223+1 records in RT: 223+1 records out RT: 234179072 bytes (234 MB) copied, 4.29453 s, 54.5 MB/s BE: 223+1 records in BE: 223+1 records out BE: 234179072 bytes (234 MB) copied, 8.04978 s, 29.1 MB/s [root@chilli io-throttle-tests]# ./andrea-test-script.sh RT: 223+1 records in RT: 223+1 records out RT: 234179072 bytes (234 MB) copied, 3.96924 s, 59.0 MB/s BE: 223+1 records in BE: 223+1 records out BE: 234179072 bytes (234 MB) copied, 7.74984 s, 30.2 MB/s [root@chilli io-throttle-tests]# ./andrea-test-script.sh RT: 223+1 records in RT: 223+1 records out RT: 234179072 bytes (234 MB) copied, 4.11254 s, 56.9 MB/s BE: 223+1 records in BE: 223+1 records out BE: 234179072 bytes (234 MB) copied, 7.8678 s, 29.8 MB/s [root@chilli io-throttle-tests]# ./andrea-test-script.sh RT: 223+1 records in RT: 223+1 records out RT: 234179072 bytes (234 MB) copied, 3.95979 s, 59.1 MB/s BE: 223+1 records in BE: 223+1 records out BE: 234179072 bytes (234 MB) copied, 7.73976 s, 30.3 MB/s Disk3 ===== Intel SSD [root@chilli io-throttle-tests]# ./andrea-test-script.sh RT: 223+1 records in RT: 223+1 records out RT: 234179072 bytes (234 MB) copied, 0.996762 s, 235 MB/s BE: 223+1 records in BE: 223+1 records out BE: 234179072 bytes (234 MB) copied, 1.93268 s, 121 MB/s [root@chilli io-throttle-tests]# ./andrea-test-script.sh RT: 223+1 records in RT: 223+1 records out RT: 234179072 bytes (234 MB) copied, 0.98511 s, 238 MB/s BE: 223+1 records in BE: 223+1 records out BE: 234179072 bytes (234 MB) copied, 1.92481 s, 122 MB/s [root@chilli io-throttle-tests]# ./andrea-test-script.sh RT: 223+1 records in RT: 223+1 records out RT: 234179072 bytes (234 MB) copied, 0.986981 s, 237 MB/s BE: 223+1 records in BE: 223+1 records out BE: 234179072 bytes (234 MB) copied, 1.9312 s, 121 MB/s [root@chilli io-throttle-tests]# ./andrea-test-script.sh RT: 223+1 records in RT: 223+1 records out RT: 234179072 bytes (234 MB) copied, 0.988448 s, 237 MB/s BE: 223+1 
records in BE: 223+1 records out BE: 234179072 bytes (234 MB) copied, 1.93885 s, 121 MB/s

So I am still seeing the issue with different kinds of disks also. At this point I am really not sure why I am seeing such results. I have the following patches applied on 2.6.30-rc4 (V16):

3954-vivek.goyal2008-res_counter-introduce-ratelimiting-attributes.patch
3955-vivek.goyal2008-page_cgroup-provide-a-generic-page-tracking-infrastructure.patch
3956-vivek.goyal2008-io-throttle-controller-infrastructure.patch
3957-vivek.goyal2008-kiothrottled-throttle-buffered-io.patch
3958-vivek.goyal2008-io-throttle-instrumentation.patch
3959-vivek.goyal2008-io-throttle-export-per-task-statistics-to-userspace.patch

Thanks Vivek ^ permalink raw reply [flat|nested] 97+ messages in thread
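The priority-inversion pattern in the Disk 1 numbers can be summarized with a quick throwaway computation (an editorial aside, not part of the original mail; the figures are copied from the runs quoted above):

```python
# RT vs BE reader throughput pairs (MB/s) on Disk 1, from the quoted runs.
with_io_throttle = [(45.0, 53.4), (44.7, 45.7), (45.1, 66.0), (44.9, 44.7)]
vanilla = [(82.4, 44.2), (86.8, 45.2), (85.7, 44.9), (87.0, 45.2)]

def rt_advantage(runs):
    """Average ratio of RT reader throughput to BE reader throughput."""
    return sum(rt / be for rt, be in runs) / len(runs)

# On vanilla CFQ the RT-class reader runs roughly twice as fast as the
# BE reader; on the io-throttle kernel the advantage disappears entirely.
print(f"io-throttle kernel: RT/BE ~ {rt_advantage(with_io_throttle):.2f}")
print(f"vanilla kernel:     RT/BE ~ {rt_advantage(vanilla):.2f}")
```

(As the thread goes on to establish, this particular anomaly turned out to be a kernel config difference rather than the io-throttle patches themselves.)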
* Re: IO scheduler based IO Controller V2 2009-05-07 14:11 ` Vivek Goyal @ 2009-05-07 14:45 ` Vivek Goyal 0 siblings, 0 replies; 97+ messages in thread From: Vivek Goyal @ 2009-05-07 14:45 UTC (permalink / raw) To: Andrea Righi Cc: dhaval-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, snitzer-H+wXaHxf7aLQT0dZR+AlfA, dm-devel-H+wXaHxf7aLQT0dZR+AlfA, jens.axboe-QHcLZuEGTsvQT0dZR+AlfA, agk-H+wXaHxf7aLQT0dZR+AlfA, balbir-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, paolo.valente-rcYM44yAMweonA0d6jMUrA, fernando-gVGce1chcLdL9jVzuh4AOg, jmoyer-H+wXaHxf7aLQT0dZR+AlfA, fchecconi-Re5JQEeQqe8AvxtiuMwx3w, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Andrew Morton On Thu, May 07, 2009 at 10:11:26AM -0400, Vivek Goyal wrote: [..] > [root@chilli io-throttle-tests]# ./andrea-test-script.sh > RT: 223+1 records in > RT: 223+1 records out > RT: 234179072 bytes (234 MB) copied, 0.988448 s, 237 MB/s > BE: 223+1 records in > BE: 223+1 records out > BE: 234179072 bytes (234 MB) copied, 1.93885 s, 121 MB/s > > So I am still seeing the issue with different kinds of disks also. At this point > I am really not sure why I am seeing such results. Hold on. I think I found the culprit here. I was wondering what the difference between the two setups was, and realized that with vanilla kernels I had done "make defconfig" while with io-throttle kernels I had used an old config of mine and done "make oldconfig". So the config files were different. I have now used the same config file and the issue seems to have gone away. I will look into why an old config file can force such issues. So now we are left with the issue of losing the notion of priority and class within a cgroup. In fact, on bigger systems we will probably run into kiothrottled scalability issues, as a single thread is trying to cater to all the disks.
If we do max bw control at the IO scheduler level, then I think we should be able to control max bw while maintaining the notion of priority and class within a cgroup. Also, there are multiple pdflush threads, and Jens seems to be pushing per-bdi flusher threads, which will give us greater scalability, and we won't have to replicate that infrastructure for kiothrottled. Thanks Vivek ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: IO scheduler based IO Controller V2 2009-05-07 14:45 ` Vivek Goyal @ 2009-05-07 15:36 ` Vivek Goyal 0 siblings, 0 replies; 97+ messages in thread From: Vivek Goyal @ 2009-05-07 15:36 UTC (permalink / raw) To: Andrea Righi Cc: dhaval-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, snitzer-H+wXaHxf7aLQT0dZR+AlfA, dm-devel-H+wXaHxf7aLQT0dZR+AlfA, jens.axboe-QHcLZuEGTsvQT0dZR+AlfA, agk-H+wXaHxf7aLQT0dZR+AlfA, balbir-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, paolo.valente-rcYM44yAMweonA0d6jMUrA, fernando-gVGce1chcLdL9jVzuh4AOg, jmoyer-H+wXaHxf7aLQT0dZR+AlfA, fchecconi-Re5JQEeQqe8AvxtiuMwx3w, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Andrew Morton On Thu, May 07, 2009 at 10:45:01AM -0400, Vivek Goyal wrote: > On Thu, May 07, 2009 at 10:11:26AM -0400, Vivek Goyal wrote: > > [..] > > [root@chilli io-throttle-tests]# ./andrea-test-script.sh > > RT: 223+1 records in > > RT: 223+1 records out > > RT: 234179072 bytes (234 MB) copied, 0.988448 s, 237 MB/s > > BE: 223+1 records in > > BE: 223+1 records out > > BE: 234179072 bytes (234 MB) copied, 1.93885 s, 121 MB/s > > > > So I am still seeing the issue with different kinds of disks also. At this point > > I am really not sure why I am seeing such results. > > Hold on. I think I found the culprit here. I was wondering what the difference > between the two setups was, and realized that with vanilla kernels > I had done "make defconfig" while with io-throttle kernels I had used an > old config of mine and done "make oldconfig". So the config files > were different. > > I have now used the same config file and the issue seems to have gone away. I > will look into why an old config file can force such issues. > Hmm.., my old config had "AS" as the default scheduler; that's why I was seeing the strange issue of the RT task finishing after BE. My apologies for that. I somehow assumed that CFQ was the default scheduler in my config.
So I have re-run the test to see if we are still seeing the issue of losing priority and class within a cgroup. And we still do.

2.6.30-rc4 with io-throttle patches
===================================

Test1
=====
- Two readers, one BE prio 0 and the other BE prio 7, in a cgroup limited to 8 MB/s BW.

234179072 bytes (234 MB) copied, 55.8448 s, 4.2 MB/s
prio 0 task finished
234179072 bytes (234 MB) copied, 55.8878 s, 4.2 MB/s

Test2
=====
- Two readers, one RT prio 0 and the other BE prio 7, in a cgroup limited to 8 MB/s BW.

234179072 bytes (234 MB) copied, 55.8876 s, 4.2 MB/s
234179072 bytes (234 MB) copied, 55.8984 s, 4.2 MB/s
RT task finished

Test3
=====
- Reader Starvation
- I created a cgroup with a BW limit of 64 MB/s. First I ran the reader alone, and then the reader along with 4 writers, 4 times.

Reader alone
234179072 bytes (234 MB) copied, 3.71796 s, 63.0 MB/s

Reader with 4 writers
---------------------
First run
234179072 bytes (234 MB) copied, 30.394 s, 7.7 MB/s

Second run
234179072 bytes (234 MB) copied, 26.9607 s, 8.7 MB/s

Third run
234179072 bytes (234 MB) copied, 37.3515 s, 6.3 MB/s

Fourth run
234179072 bytes (234 MB) copied, 36.817 s, 6.4 MB/s

Note that out of the 64 MB/s limit of this cgroup, the reader does not get even 1/5 of the BW. In normal systems, readers are advantaged and a reader gets its job done much faster even in the presence of multiple writers.

Vanilla 2.6.30-rc4
==================

Test3
=====
Reader alone
234179072 bytes (234 MB) copied, 2.52195 s, 92.9 MB/s

Reader with 4 writers
---------------------
First run
234179072 bytes (234 MB) copied, 4.39929 s, 53.2 MB/s

Second run
234179072 bytes (234 MB) copied, 4.55929 s, 51.4 MB/s

Third run
234179072 bytes (234 MB) copied, 4.79855 s, 48.8 MB/s

Fourth run
234179072 bytes (234 MB) copied, 4.5069 s, 52.0 MB/s

Notice that without any writers we see a BW of 92 MB/s, and more than 50% of that BW is still assigned to the reader in the presence of writers. Compare this with the io-throttle cgroup of 64 MB/s, where the reader struggles to get 10-15% of the BW.

So any 2nd-level control will break the notion and assumptions of the underlying IO scheduler. We should probably do the control at the IO scheduler level to make sure we don't run into such issues while getting hierarchical fair share for groups.

Thanks Vivek > So now we are left with the issue of losing the notion of priority and > class within a cgroup. In fact, on bigger systems we will probably run into > kiothrottled scalability issues, as a single thread is trying to cater to > all the disks. > > If we do max bw control at the IO scheduler level, then I think we should be able > to control max bw while maintaining the notion of priority and class within > a cgroup. Also, there are multiple pdflush threads, and Jens seems to be pushing > per-bdi flusher threads, which will give us greater scalability, and we won't > have to replicate that infrastructure for kiothrottled. > > Thanks > Vivek ^ permalink raw reply [flat|nested] 97+ messages in thread
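As a sanity check on the reader-starvation arithmetic above (a throwaway computation, not part of the original mail; the figures are copied from the four "reader with 4 writers" runs quoted above):

```python
# Reader throughput (MB/s) across the four "reader with 4 writers" runs
# in the 64 MB/s io-throttle cgroup.
limit_mbps = 64.0
runs_mbps = [7.7, 8.7, 6.3, 6.4]

avg = sum(runs_mbps) / len(runs_mbps)
share = avg / limit_mbps

# The reader averages about 7.3 MB/s, roughly 11% of the cgroup's limit:
# well under the 1/5 (12.8 MB/s) mentioned in the mail, and far from the
# >50% share the reader gets on vanilla CFQ.
print(f"average reader throughput: {avg:.1f} MB/s ({share:.0%} of limit)")
```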
* Re: IO scheduler based IO Controller V2 2009-05-07 15:36 ` Vivek Goyal @ 2009-05-07 15:42 ` Vivek Goyal 0 siblings, 0 replies; 97+ messages in thread From: Vivek Goyal @ 2009-05-07 15:42 UTC (permalink / raw) To: Andrea Righi Cc: dhaval-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, snitzer-H+wXaHxf7aLQT0dZR+AlfA, dm-devel-H+wXaHxf7aLQT0dZR+AlfA, jens.axboe-QHcLZuEGTsvQT0dZR+AlfA, agk-H+wXaHxf7aLQT0dZR+AlfA, balbir-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, paolo.valente-rcYM44yAMweonA0d6jMUrA, fernando-gVGce1chcLdL9jVzuh4AOg, jmoyer-H+wXaHxf7aLQT0dZR+AlfA, fchecconi-Re5JQEeQqe8AvxtiuMwx3w, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Andrew Morton On Thu, May 07, 2009 at 11:36:42AM -0400, Vivek Goyal wrote: > On Thu, May 07, 2009 at 10:45:01AM -0400, Vivek Goyal wrote: > > On Thu, May 07, 2009 at 10:11:26AM -0400, Vivek Goyal wrote: > > > > [..] > > > [root@chilli io-throttle-tests]# ./andrea-test-script.sh > > > RT: 223+1 records in > > > RT: 223+1 records out > > > RT: 234179072 bytes (234 MB) copied, 0.988448 s, 237 MB/s > > > BE: 223+1 records in > > > BE: 223+1 records out > > > BE: 234179072 bytes (234 MB) copied, 1.93885 s, 121 MB/s > > > > > > So I am still seeing the issue with different kinds of disks also. At this point > > > I am really not sure why I am seeing such results. > > > > Hold on. I think I found the culprit here. I was wondering what the difference > > between the two setups was, and realized that with vanilla kernels > > I had done "make defconfig" while with io-throttle kernels I had used an > > old config of mine and done "make oldconfig". So the config files > > were different. > > > > I have now used the same config file and the issue seems to have gone away. I > > will look into why an old config file can force such issues. > > > > Hmm.., my old config had "AS" as the default scheduler; that's why I was seeing > the strange issue of the RT task finishing after BE. My apologies for that.
> I somehow assumed that CFQ was the default scheduler in my config. > > So I have re-run the test to see if we are still seeing the issue of > losing priority and class within a cgroup. And we still do. > > 2.6.30-rc4 with io-throttle patches > =================================== > Test1 > ===== > - Two readers, one BE prio 0 and the other BE prio 7, in a cgroup limited to > 8 MB/s BW. > > 234179072 bytes (234 MB) copied, 55.8448 s, 4.2 MB/s > prio 0 task finished > 234179072 bytes (234 MB) copied, 55.8878 s, 4.2 MB/s > > Test2 > ===== > - Two readers, one RT prio 0 and the other BE prio 7, in a cgroup limited to > 8 MB/s BW. > > 234179072 bytes (234 MB) copied, 55.8876 s, 4.2 MB/s > 234179072 bytes (234 MB) copied, 55.8984 s, 4.2 MB/s > RT task finished > > Test3 > ===== > - Reader Starvation > - I created a cgroup with a BW limit of 64 MB/s. First I ran the reader > alone, and then the reader along with 4 writers, 4 times. > > Reader alone > 234179072 bytes (234 MB) copied, 3.71796 s, 63.0 MB/s > > Reader with 4 writers > --------------------- > First run > 234179072 bytes (234 MB) copied, 30.394 s, 7.7 MB/s > > Second run > 234179072 bytes (234 MB) copied, 26.9607 s, 8.7 MB/s > > Third run > 234179072 bytes (234 MB) copied, 37.3515 s, 6.3 MB/s > > Fourth run > 234179072 bytes (234 MB) copied, 36.817 s, 6.4 MB/s > > Note that out of the 64 MB/s limit of this cgroup, the reader does not get even > 1/5 of the BW. In normal systems, readers are advantaged and a reader gets > its job done much faster even in the presence of multiple writers.
> > Vanilla 2.6.30-rc4 > ================== > > Test3 > ===== > Reader alone > 234179072 bytes (234 MB) copied, 2.52195 s, 92.9 MB/s > > Reader with 4 writers > --------------------- > First run > 234179072 bytes (234 MB) copied, 4.39929 s, 53.2 MB/s > > Second run > 234179072 bytes (234 MB) copied, 4.55929 s, 51.4 MB/s > > Third run > 234179072 bytes (234 MB) copied, 4.79855 s, 48.8 MB/s > > Fourth run > 234179072 bytes (234 MB) copied, 4.5069 s, 52.0 MB/s > > Notice that without any writers we see a BW of 92 MB/s, and > more than 50% of that BW is still assigned to the reader in the presence of > writers. Compare this with the io-throttle cgroup of 64 MB/s, where the reader > struggles to get 10-15% of the BW. > > So any 2nd-level control will break the notion and assumptions of the > underlying IO scheduler. We should probably do the control at the IO scheduler > level to make sure we don't run into such issues while getting > hierarchical fair share for groups.

Forgot to attach my reader-writer script last time. Here it is.

***************************************************************
#!/bin/bash

mount /dev/sdb1 /mnt/sdb
mount -t cgroup -o blockio blockio /cgroup/iot/
mkdir -p /cgroup/iot/test1 /cgroup/iot/test2

# Set bw limit of 64 MB/s on sdb
echo "/dev/sdb:$((64 * 1024 * 1024)):0:0" > /cgroup/iot/test1/blockio.bandwidth-max

sync
echo 3 > /proc/sys/vm/drop_caches

echo $$ > /cgroup/iot/test1/tasks

ionice -c 2 -n 7 dd if=/dev/zero of=/mnt/sdb/testzerofile1 bs=4K count=524288 &
echo $!
ionice -c 2 -n 7 dd if=/dev/zero of=/mnt/sdb/testzerofile2 bs=4K count=524288 &
echo $!
ionice -c 2 -n 7 dd if=/dev/zero of=/mnt/sdb/testzerofile3 bs=4K count=524288 &
echo $!
ionice -c 2 -n 7 dd if=/dev/zero of=/mnt/sdb/testzerofile4 bs=4K count=524288 &
echo $!

sleep 5
echo "Launching reader"
ionice -c 2 -n 0 dd if=/mnt/sdb/zerofile2 of=/dev/zero &
pid2=$!
echo $pid2
wait $pid2
echo "Reader Finished"
killall dd
**********************************************************************

Thanks Vivek ^ permalink raw reply [flat|nested] 97+ messages in thread
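For scale, each background writer in the script above is given bs=4K and count=524288; a quick throwaway check (an editorial aside, not part of the original mail) of what that amounts to:

```python
# Per-writer output size implied by the dd arguments in the script above.
bs = 4 * 1024        # bs=4K
count = 524288       # count=524288
per_writer = bs * count
total = 4 * per_writer  # four background writers

# Each writer produces 2 GiB, so the writers issue 8 GiB of writes in
# total against the reader's 234 MB read, keeping them active for the
# reader's whole run.
print(per_writer // 2**30, "GiB per writer;", total // 2**30, "GiB total")
```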
* Re: IO scheduler based IO Controller V2 @ 2009-05-07 15:42 ` Vivek Goyal 0 siblings, 0 replies; 97+ messages in thread From: Vivek Goyal @ 2009-05-07 15:42 UTC (permalink / raw) To: Andrea Righi Cc: Andrew Morton, nauman, dpshah, lizf, mikew, fchecconi, paolo.valente, jens.axboe, ryov, fernando, s-uchida, taka, guijianfeng, jmoyer, dhaval, balbir, linux-kernel, containers, agk, dm-devel, snitzer, m-ikeda, peterz On Thu, May 07, 2009 at 11:36:42AM -0400, Vivek Goyal wrote: > On Thu, May 07, 2009 at 10:45:01AM -0400, Vivek Goyal wrote: > > On Thu, May 07, 2009 at 10:11:26AM -0400, Vivek Goyal wrote: > > > > [..] > > > [root@chilli io-throttle-tests]# ./andrea-test-script.sh > > > RT: 223+1 records in > > > RT: 223+1 records out > > > RT: 234179072 bytes (234 MB) copied, 0.988448 s, 237 MB/s > > > BE: 223+1 records in > > > BE: 223+1 records out > > > BE: 234179072 bytes (234 MB) copied, 1.93885 s, 121 MB/s > > > > > > So I am still seeing the issue with different kinds of disks also. At this point > > > of time I am really not sure why I am seeing such results. > > > > Hold on. I think I found the culprit here. I was wondering what the > > difference between the two setups was and realized that with vanilla kernels > > I had done "make defconfig" and with io-throttle kernels I had used an > > old config of mine and did "make oldconfig". So basically the config files > > were different. > > > > I have now used the same config file and the issue seems to have gone away. I > > will look into why an old config file can cause such issues. > > > > Hmm.., my old config had "AS" as the default scheduler; that's why I was seeing > the strange issue of the RT task finishing after BE. My apologies for that. I > somehow assumed that CFQ was the default scheduler in my config. > > So I have re-run the test to see if we are still seeing the issue of > losing priority and class with-in a cgroup. And we still do..
> > 2.6.30-rc4 with io-throttle patches > =================================== > Test1 > ===== > - Two readers, one BE prio 0 and other BE prio 7 in a cgroup limited with > 8MB/s BW. > > 234179072 bytes (234 MB) copied, 55.8448 s, 4.2 MB/s > prio 0 task finished > 234179072 bytes (234 MB) copied, 55.8878 s, 4.2 MB/s > > Test2 > ===== > - Two readers, one RT prio 0 and other BE prio 7 in a cgroup limited with > 8MB/s BW. > > 234179072 bytes (234 MB) copied, 55.8876 s, 4.2 MB/s > 234179072 bytes (234 MB) copied, 55.8984 s, 4.2 MB/s > RT task finished > > Test3 > ===== > - Reader Starvation > - I created a cgroup with BW limit of 64MB/s. First I just run the reader > alone and then I run reader along with 4 writers 4 times. > > Reader alone > 234179072 bytes (234 MB) copied, 3.71796 s, 63.0 MB/s > > Reader with 4 writers > --------------------- > First run > 234179072 bytes (234 MB) copied, 30.394 s, 7.7 MB/s > > Second run > 234179072 bytes (234 MB) copied, 26.9607 s, 8.7 MB/s > > Third run > 234179072 bytes (234 MB) copied, 37.3515 s, 6.3 MB/s > > Fourth run > 234179072 bytes (234 MB) copied, 36.817 s, 6.4 MB/s > > Note that out of 64MB/s limit of this cgroup, reader does not get even > 1/5 of the BW. In normal systems, readers are advantaged and reader gets > its job done much faster even in presence of multiple writers. > > Vanilla 2.6.30-rc4 > ================== > > Test3 > ===== > Reader alone > 234179072 bytes (234 MB) copied, 2.52195 s, 92.9 MB/s > > Reader with 4 writers > --------------------- > First run > 234179072 bytes (234 MB) copied, 4.39929 s, 53.2 MB/s > > Second run > 234179072 bytes (234 MB) copied, 4.55929 s, 51.4 MB/s > > Third run > 234179072 bytes (234 MB) copied, 4.79855 s, 48.8 MB/s > > Fourth run > 234179072 bytes (234 MB) copied, 4.5069 s, 52.0 MB/s > > Notice, that without any writers we seem to be having BW of 92MB/s and > more than 50% of that BW is still assigned to reader in presence of > writers. 
Compare this with io-throttle cgroup of 64MB/s where reader > struggles to get 10-15% of BW. > > So any 2nd level control will break the notion and assumptions of > underlying IO scheduler. We should probably do control at IO scheduler > level to make sure we don't run into such issues while getting > hierarchical fair share for groups. > Forgot to attach my reader-writer script last time. Here it is.

***************************************************************
#!/bin/bash

mount /dev/sdb1 /mnt/sdb
mount -t cgroup -o blockio blockio /cgroup/iot/
mkdir -p /cgroup/iot/test1 /cgroup/iot/test2

# Set bw limit of 64 MB/s on sdb
echo "/dev/sdb:$((64 * 1024 * 1024)):0:0" > /cgroup/iot/test1/blockio.bandwidth-max

sync
echo 3 > /proc/sys/vm/drop_caches

echo $$ > /cgroup/iot/test1/tasks

ionice -c 2 -n 7 dd if=/dev/zero of=/mnt/sdb/testzerofile1 bs=4K count=524288 &
echo $!
ionice -c 2 -n 7 dd if=/dev/zero of=/mnt/sdb/testzerofile2 bs=4K count=524288 &
echo $!
ionice -c 2 -n 7 dd if=/dev/zero of=/mnt/sdb/testzerofile3 bs=4K count=524288 &
echo $!
ionice -c 2 -n 7 dd if=/dev/zero of=/mnt/sdb/testzerofile4 bs=4K count=524288 &
echo $!

sleep 5
echo "Launching reader"

ionice -c 2 -n 0 dd if=/mnt/sdb/zerofile2 of=/dev/zero &
pid2=$!
echo $pid2
wait $pid2
echo "Reader Finished"
killall dd
**********************************************************************

Thanks
Vivek

^ permalink raw reply	[flat|nested] 97+ messages in thread
* Re: IO scheduler based IO Controller V2 2009-05-07 15:36 ` Vivek Goyal (?) (?) @ 2009-05-07 22:19 ` Andrea Righi 2009-05-08 18:09 ` Vivek Goyal 2009-05-08 18:09 ` Vivek Goyal -1 siblings, 2 replies; 97+ messages in thread From: Andrea Righi @ 2009-05-07 22:19 UTC (permalink / raw) To: Vivek Goyal Cc: Andrew Morton, nauman, dpshah, lizf, mikew, fchecconi, paolo.valente, jens.axboe, ryov, fernando, s-uchida, taka, guijianfeng, jmoyer, dhaval, balbir, linux-kernel, containers, agk, dm-devel, snitzer, m-ikeda, peterz On Thu, May 07, 2009 at 11:36:42AM -0400, Vivek Goyal wrote: > Hmm.., my old config had "AS" as default scheduler that's why I was seeing > the strange issue of RT task finishing after BE. My apologies for that. I > somehow assumed that CFQ is default scheduler in my config. ok. > > So I have re-run the test to see if we are still seeing the issue of > loosing priority and class with-in cgroup. And we still do.. > > 2.6.30-rc4 with io-throttle patches > =================================== > Test1 > ===== > - Two readers, one BE prio 0 and other BE prio 7 in a cgroup limited with > 8MB/s BW. > > 234179072 bytes (234 MB) copied, 55.8448 s, 4.2 MB/s > prio 0 task finished > 234179072 bytes (234 MB) copied, 55.8878 s, 4.2 MB/s > > Test2 > ===== > - Two readers, one RT prio 0 and other BE prio 7 in a cgroup limited with > 8MB/s BW. > > 234179072 bytes (234 MB) copied, 55.8876 s, 4.2 MB/s > 234179072 bytes (234 MB) copied, 55.8984 s, 4.2 MB/s > RT task finished ok, coherent with the current io-throttle implementation. > > Test3 > ===== > - Reader Starvation > - I created a cgroup with BW limit of 64MB/s. First I just run the reader > alone and then I run reader along with 4 writers 4 times. 
> > Reader alone > > 234179072 bytes (234 MB) copied, 3.71796 s, 63.0 MB/s > > > > Reader with 4 writers > > --------------------- > > First run > > 234179072 bytes (234 MB) copied, 30.394 s, 7.7 MB/s > > > > Second run > > 234179072 bytes (234 MB) copied, 26.9607 s, 8.7 MB/s > > > > Third run > > 234179072 bytes (234 MB) copied, 37.3515 s, 6.3 MB/s > > > > Fourth run > > 234179072 bytes (234 MB) copied, 36.817 s, 6.4 MB/s > > > > Note that out of 64MB/s limit of this cgroup, reader does not get even > > 1/5 of the BW. In normal systems, readers are advantaged and reader gets > > its job done much faster even in presence of multiple writers.

And this is also coherent. The throttling is equally probable for reads and writes. But this shouldn't happen if we saturate the physical disk BW (doing proportional BW control or using a watermark close to 100 in io-throttle). In that case the IO scheduler logic shouldn't be totally broken.

Doing a very quick test with io-throttle, using a 10MB/s BW limit and blockio.watermark=90:

Launching reader
256+0 records in
256+0 records out
268435456 bytes (268 MB) copied, 32.2798 s, 8.3 MB/s

In the same time the writers wrote ~190MB, so the single reader got about 1/3 of the total BW.

182M testzerofile4
198M testzerofile1
188M testzerofile3
189M testzerofile2

Things are probably better with many cgroups, many readers and writers, and the disk BW more saturated in general. The proportional BW approach wins in this case, because if you always use the whole disk BW the logic of the IO scheduler is still valid.
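The watermark-based soft limit described above can be sketched roughly as follows. This is illustrative logic only, not the actual io-throttle code: a cgroup's max-BW cap is enforced only once overall disk utilization crosses the watermark, and below it the cgroup may freely exceed its nominal limit.

```shell
#!/bin/bash
# Illustrative sketch of watermark-based soft limiting (assumed logic,
# not the io-throttle implementation).
#   util:      current disk utilization, percent
#   watermark: blockio.watermark threshold, percent
#   rate:      cgroup's current IO rate, MB/s
#   limit:     cgroup's configured max BW, MB/s
should_throttle() {
    local util=$1 watermark=$2 rate=$3 limit=$4
    # Throttle only if the disk is saturated AND the cgroup is over its cap.
    if [ "$util" -ge "$watermark" ] && [ "$rate" -gt "$limit" ]; then
        echo throttle
    else
        echo pass
    fi
}

should_throttle 95 90 12 10   # disk saturated, cgroup over its 10MB/s cap
should_throttle 50 90 12 10   # disk mostly idle: soft limit not applied
```

This is why the 10MB/s cgroup above could still reach ~8.3MB/s reads plus ~190MB of writes: with the disk below the watermark, the cap simply never engaged.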
> > Vanilla 2.6.30-rc4 > ================== > > Test3 > ===== > Reader alone > 234179072 bytes (234 MB) copied, 2.52195 s, 92.9 MB/s > > Reader with 4 writers > --------------------- > First run > 234179072 bytes (234 MB) copied, 4.39929 s, 53.2 MB/s > > Second run > 234179072 bytes (234 MB) copied, 4.55929 s, 51.4 MB/s > > Third run > 234179072 bytes (234 MB) copied, 4.79855 s, 48.8 MB/s > > Fourth run > 234179072 bytes (234 MB) copied, 4.5069 s, 52.0 MB/s > > Notice that without any writers we seem to be having BW of 92MB/s and > more than 50% of that BW is still assigned to reader in presence of > writers. Compare this with io-throttle cgroup of 64MB/s where reader > struggles to get 10-15% of BW. > > So any 2nd level control will break the notion and assumptions of > underlying IO scheduler. We should probably do control at IO scheduler > level to make sure we don't run into such issues while getting > hierarchical fair share for groups. > > Thanks > Vivek > What are the results with your IO scheduler controller (if you already have them, otherwise I'll repeat this test on my system)? It seems a very interesting test to compare the advantages of the IO scheduler solution with respect to the io-throttle approach. Thanks, -Andrea > > So now we are left with the issue of losing the notion of priority and > > class with-in cgroup. In fact on bigger systems we will probably run into > > issues of kiothrottled scalability as a single thread is trying to cater to > > all the disks. > > > > If we do max bw control at IO scheduler level, then I think we should be able > > to control max bw while maintaining the notion of priority and class with-in > > cgroup. Also there are multiple pdflush threads and jens seems to be pushing > > flusher threads per bdi which will help us achieve greater scalability and > > don't have to replicate that infrastructure for kiothrottled also. > > > > Thanks > > Vivek ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: IO scheduler based IO Controller V2 2009-05-07 22:19 ` Andrea Righi @ 2009-05-08 18:09 ` Vivek Goyal 2009-05-08 20:05 ` Andrea Righi [not found] ` <20090508180951.GG7293-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2009-05-08 18:09 ` Vivek Goyal 1 sibling, 2 replies; 97+ messages in thread From: Vivek Goyal @ 2009-05-08 18:09 UTC (permalink / raw) To: Andrea Righi Cc: Andrew Morton, nauman, dpshah, lizf, mikew, fchecconi, paolo.valente, jens.axboe, ryov, fernando, s-uchida, taka, guijianfeng, jmoyer, dhaval, balbir, linux-kernel, containers, agk, dm-devel, snitzer, m-ikeda, peterz On Fri, May 08, 2009 at 12:19:01AM +0200, Andrea Righi wrote: > On Thu, May 07, 2009 at 11:36:42AM -0400, Vivek Goyal wrote: > > Hmm.., my old config had "AS" as default scheduler that's why I was seeing > > the strange issue of RT task finishing after BE. My apologies for that. I > > somehow assumed that CFQ is default scheduler in my config. > > ok. > > > > > So I have re-run the test to see if we are still seeing the issue of > > loosing priority and class with-in cgroup. And we still do.. > > > > 2.6.30-rc4 with io-throttle patches > > =================================== > > Test1 > > ===== > > - Two readers, one BE prio 0 and other BE prio 7 in a cgroup limited with > > 8MB/s BW. > > > > 234179072 bytes (234 MB) copied, 55.8448 s, 4.2 MB/s > > prio 0 task finished > > 234179072 bytes (234 MB) copied, 55.8878 s, 4.2 MB/s > > > > Test2 > > ===== > > - Two readers, one RT prio 0 and other BE prio 7 in a cgroup limited with > > 8MB/s BW. > > > > 234179072 bytes (234 MB) copied, 55.8876 s, 4.2 MB/s > > 234179072 bytes (234 MB) copied, 55.8984 s, 4.2 MB/s > > RT task finished > > ok, coherent with the current io-throttle implementation. > > > > > Test3 > > ===== > > - Reader Starvation > > - I created a cgroup with BW limit of 64MB/s. First I just run the reader > > alone and then I run reader along with 4 writers 4 times. 
> > > > Reader alone > > 234179072 bytes (234 MB) copied, 3.71796 s, 63.0 MB/s > > > > Reader with 4 writers > > --------------------- > > First run > > 234179072 bytes (234 MB) copied, 30.394 s, 7.7 MB/s > > > > Second run > > 234179072 bytes (234 MB) copied, 26.9607 s, 8.7 MB/s > > > > Third run > > 234179072 bytes (234 MB) copied, 37.3515 s, 6.3 MB/s > > > > Fourth run > > 234179072 bytes (234 MB) copied, 36.817 s, 6.4 MB/s > > > > Note that out of 64MB/s limit of this cgroup, reader does not get even > > 1/5 of the BW. In normal systems, readers are advantaged and reader gets > > its job done much faster even in presence of multiple writers. > > And this is also coherent. The throttling is equally probable for read > and write. But this shouldn't happen if we saturate the physical disk BW > (doing proportional BW control or using a watermark close to 100 in > io-throttle). In this case IO scheduler logic shouldn't be totally > broken. > Can you please explain the watermark a bit more? So blockio.watermark=90 means 90% of what? Total disk BW? But disk BW varies based on the workload? > Doing a very quick test with io-throttle, using a 10MB/s BW limit and > blockio.watermark=90: > > Launching reader > 256+0 records in > 256+0 records out > 268435456 bytes (268 MB) copied, 32.2798 s, 8.3 MB/s > > In the same time the writers wrote ~190MB, so the single reader got > about 1/3 of the total BW. > > 182M testzerofile4 > 198M testzerofile1 > 188M testzerofile3 > 189M testzerofile2 > But it's not really a max bw controller at all now, is it? I seem to be getting a total BW of (268+182+198+188+189)/32 = 32MB/s and you set the limit to 10MB/s? [..] > What are the results with your IO scheduler controller (if you already > have them, otherwise I'll repeat this test in my system)? It seems a > very interesting test to compare the advantages of the IO scheduler > solution with respect to the io-throttle approach. > I had not done any reader-writer testing so far.
But you forced me to run some now. :-) Here are the results. Because one is a max BW controller and the other is a proportional BW controller, an exact comparison is hard. Still....

Test1
=====
Try to run lots of writers (50 random writers using fio and 4 sequential writers with dd if=/dev/zero) and one single reader either in the root group or within one cgroup, to show that readers are not starved by writers, as opposed to the io-throttle controller.

Run test1 with vanilla kernel with CFQ
======================================
Launched 50 fio random writers, 4 sequential writers and 1 reader in root and noted how long it takes the reader to finish. Also noted the per second output from iostat -d 1 -m /dev/sdb1 to monitor how disk throughput varies.

***********************************************************************
# launch 50 writers fio job
fio_args="--size=64m --rw=write --numjobs=50 --group_reporting"
fio $fio_args --name=test2 --directory=/mnt/sdb/fio2/ --output=/mnt/sdb/fio2/test2.log > /dev/null &

# launch 4 sequential writers
ionice -c 2 -n 7 dd if=/dev/zero of=/mnt/sdb/testzerofile1 bs=4K count=524288 &
ionice -c 2 -n 7 dd if=/dev/zero of=/mnt/sdb/testzerofile2 bs=4K count=524288 &
ionice -c 2 -n 7 dd if=/dev/zero of=/mnt/sdb/testzerofile3 bs=4K count=524288 &
ionice -c 2 -n 7 dd if=/dev/zero of=/mnt/sdb/testzerofile4 bs=4K count=524288 &

echo "Sleeping for 5 seconds"
sleep 5
echo "Launching reader"

ionice -c 2 -n 0 dd if=/mnt/sdb/zerofile2 of=/dev/zero &
wait $!
echo "Reader Finished"
***************************************************************************

Results
-------
234179072 bytes (234 MB) copied, 4.55047 s, 51.5 MB/s

Reader finished in 4.5 seconds.
Following are a few lines from the iostat output.

***********************************************************************
Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            151.00         0.04        48.33          0         48

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            120.00         1.78        31.23          1         31

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            504.95        56.75         7.51         57          7

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            547.47        62.71         4.47         62          4

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            441.00        49.80         7.82         49          7

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            441.41        48.28        13.84         47         13
*************************************************************************

Note how write traffic picks up first, and then suddenly the reader comes in and CFQ allocates a huge chunk of BW to the reader to give it the advantage.

Run Test1 with IO scheduler based io controller patch
=====================================================
234179072 bytes (234 MB) copied, 5.23141 s, 44.8 MB/s

Reader finishes in 5.23 seconds. Why does it take more time than with CFQ? Because it looks like the current algorithm is not punishing writers that hard. This can be fixed and is not an issue.

Following is some output from iostat.

**********************************************************************
Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            139.60         0.04        43.83          0         44

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            227.72        16.88        29.05         17         29

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            349.00        35.04        16.06         35         16

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            339.00        34.16        21.07         34         21

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            343.56        36.68        12.54         37         12

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            378.00        38.68        19.47         38         19

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            532.00        59.06        10.00         59         10

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            125.00         2.62        38.82          2         38
************************************************************************

Note how read throughput goes up when the reader comes in.
Also note that the writer is still getting some decent IO done, and that's why the reader took a little more time as compared to CFQ.

Run Test1 with IO throttle patches
==================================
Now the same test is run with the io-throttle patches. The only difference is that I run the test in a cgroup with a max limit of 32MB/s. That should mean that effectively we got a disk which can support at most a 32MB/s IO rate. If we look at the above CFQ and io controller results, it looks like with the above load we touched a peak of 70MB/s. So one can think of the same test being run on a disk roughly half the speed of the original disk.

234179072 bytes (234 MB) copied, 144.207 s, 1.6 MB/s

The reader got a disk rate of 1.6MB/s (5%) out of the 32MB/s capacity, as opposed to the CFQ and io scheduler controller cases, where the reader got around 70-80% of disk BW under a similar workload.

Test2
=====
Run test2 with io scheduler based io controller
===============================================
Now run almost the same test with a little difference. This time I create two cgroups of the same weight, 1000. I run the 50 fio random writers in one cgroup and the 4 sequential writers and 1 reader in the second group. This test is more to show that proportional BW IO control is working: the reader in group2 does not kill group1's writes (providing isolation), and secondly, the reader still gets preference over the writers which are in the same group.

                    root
                   /    \
              group1    group2
    (50 fio writers)    (4 writers and one reader)

234179072 bytes (234 MB) copied, 12.8546 s, 18.2 MB/s

The reader finished in almost 13 seconds and got around 18MB/s. Remember, when everything was in the root group the reader got around 45MB/s. This is to account for the fact that half of the disk is now being shared by the other cgroup, which is running the 50 fio writers, and the reader can't steal the disk from them.
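For reference, the two-group setup for Test2 can be sketched as below. The exact cgroup interface of the io scheduler based controller isn't shown in this thread, so the mount option and the io.weight file name here are assumptions for illustration only:

```shell
#!/bin/bash
# Hypothetical setup for Test2: two cgroups of equal weight 1000.
# The subsystem name "io" and the file "io.weight" are assumed for
# illustration; the patchset's real interface may differ.
mount -t cgroup -o io io /cgroup/io
mkdir -p /cgroup/io/group1 /cgroup/io/group2

echo 1000 > /cgroup/io/group1/io.weight    # 50 fio random writers go here
echo 1000 > /cgroup/io/group2/io.weight    # 4 dd writers + 1 reader go here

# Move the current shell into group1 before launching the fio job,
# then into group2 before launching the dd writers and the reader.
echo $$ > /cgroup/io/group1/tasks
```

With equal weights, each group is entitled to roughly half the disk under contention, which matches the ~18MB/s the reader saw out of a ~45MB/s disk.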
Following is some portion of the iostat output when the reader became active.

*********************************************************************
Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            103.92         0.03        40.21          0         41

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            240.00        15.78        37.40         15         37

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            206.93        13.17        28.50         13         28

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            224.75        15.39        27.89         15         28

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            270.71        16.85        25.95         16         25

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            215.84         8.81        32.40          8         32

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            216.16        19.11        20.75         18         20

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            211.11        14.67        35.77         14         35

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            208.91        15.04        26.95         15         27

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            277.23        24.30        28.53         24         28

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            202.97        12.29        34.79         12         35
**********************************************************************

Total disk throughput is varying a lot; on average it looks like it is getting 45MB/s. Let's say 50% of that is going to cgroup1 (the fio writers); then out of the remaining ~22MB/s the reader seems to have got 18MB/s. These are highly approximate numbers. I think I need to come up with some kind of tool to measure per cgroup throughput (like we have per partition stats) for a more accurate comparison. But the point is that the second cgroup got the isolation and the read got preference with-in the same cgroup. The expected behavior.

Run test2 with io-throttle
==========================
Same setup of two groups. The only difference is that I set up the two groups with a 16MB/s limit each. So the previous 32MB/s limit got divided between the two cgroups, 50% each.

234179072 bytes (234 MB) copied, 90.8055 s, 2.6 MB/s

The reader took 90 seconds to finish. It seems to have got around 16% of the available disk BW (16MB/s). The iostat output is long. Will just paste one section.
************************************************************************
[..]
Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            141.58        10.16        16.12         10         16

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            174.75         8.06        12.31          7         12

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1             47.52         0.12         6.16          0          6

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1             82.00         0.00        31.85          0         31

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            141.00         0.00        48.07          0         48

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1             72.73         0.00        26.52          0         26
***************************************************************************

Conclusion
==========
It just reaffirms that with max BW control we are not doing a fair job of throttling, and hence no longer retain the IO scheduler properties with-in a cgroup. With a proportional BW controller implemented at the IO scheduler level, one can do very tight integration with the IO controller and hence retain the IO scheduler behavior with-in a cgroup.

Thanks
Vivek

^ permalink raw reply	[flat|nested] 97+ messages in thread
* Re: IO scheduler based IO Controller V2 2009-05-08 18:09 ` Vivek Goyal @ 2009-05-08 20:05 ` Andrea Righi 2009-05-08 21:56 ` Vivek Goyal [not found] ` <20090508180951.GG7293-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 1 sibling, 1 reply; 97+ messages in thread From: Andrea Righi @ 2009-05-08 20:05 UTC (permalink / raw) To: Vivek Goyal Cc: Andrew Morton, nauman, dpshah, lizf, mikew, fchecconi, paolo.valente, jens.axboe, ryov, fernando, s-uchida, taka, guijianfeng, jmoyer, dhaval, balbir, linux-kernel, containers, agk, dm-devel, snitzer, m-ikeda, peterz On Fri, May 08, 2009 at 02:09:51PM -0400, Vivek Goyal wrote: > On Fri, May 08, 2009 at 12:19:01AM +0200, Andrea Righi wrote: > > On Thu, May 07, 2009 at 11:36:42AM -0400, Vivek Goyal wrote: > > > Hmm.., my old config had "AS" as default scheduler that's why I was seeing > > > the strange issue of RT task finishing after BE. My apologies for that. I > > > somehow assumed that CFQ is default scheduler in my config. > > > > ok. > > > > > > > > So I have re-run the test to see if we are still seeing the issue of > > > loosing priority and class with-in cgroup. And we still do.. > > > > > > 2.6.30-rc4 with io-throttle patches > > > =================================== > > > Test1 > > > ===== > > > - Two readers, one BE prio 0 and other BE prio 7 in a cgroup limited with > > > 8MB/s BW. > > > > > > 234179072 bytes (234 MB) copied, 55.8448 s, 4.2 MB/s > > > prio 0 task finished > > > 234179072 bytes (234 MB) copied, 55.8878 s, 4.2 MB/s > > > > > > Test2 > > > ===== > > > - Two readers, one RT prio 0 and other BE prio 7 in a cgroup limited with > > > 8MB/s BW. > > > > > > 234179072 bytes (234 MB) copied, 55.8876 s, 4.2 MB/s > > > 234179072 bytes (234 MB) copied, 55.8984 s, 4.2 MB/s > > > RT task finished > > > > ok, coherent with the current io-throttle implementation. > > > > > > > > Test3 > > > ===== > > > - Reader Starvation > > > - I created a cgroup with BW limit of 64MB/s. 
First I just run the reader > > > alone and then I run reader along with 4 writers 4 times. > > > > > > Reader alone > > > 234179072 bytes (234 MB) copied, 3.71796 s, 63.0 MB/s > > > > > > Reader with 4 writers > > > --------------------- > > > First run > > > 234179072 bytes (234 MB) copied, 30.394 s, 7.7 MB/s > > > > > > Second run > > > 234179072 bytes (234 MB) copied, 26.9607 s, 8.7 MB/s > > > > > > Third run > > > 234179072 bytes (234 MB) copied, 37.3515 s, 6.3 MB/s > > > > > > Fourth run > > > 234179072 bytes (234 MB) copied, 36.817 s, 6.4 MB/s > > > > > > Note that out of 64MB/s limit of this cgroup, reader does not get even > > > 1/5 of the BW. In normal systems, readers are advantaged and reader gets > > > its job done much faster even in presence of multiple writers. > > > > And this is also coherent. The throttling is equally probable for read > > and write. But this shouldn't happen if we saturate the physical disk BW > > (doing proportional BW control or using a watermark close to 100 in > > io-throttle). In this case IO scheduler logic shouldn't be totally > > broken. > > > > Can you please explain the watermark a bit more? So blockio.watermark=90 > mean 90% of what? total disk BW? But disk BW varies based on work load? The controller starts to apply throttling rules only when the total disk BW utilization is greater than 90%. The consumed BW is evaluated as (cpu_ticks / io_ticks * 100), where cpu_ticks are the ticks (in jiffies) since the last i/o request and io_ticks is the difference of ticks accounted to a particular block device, retrieved by: part_stat_read(bdev->bd_part, io_ticks) BTW it's the same metric (%util) used by iostat. 
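For a concrete feel of the metric, here is a small arithmetic sketch. It computes utilization the way iostat derives %util, i.e. io_ticks (time the device was busy) over elapsed time; note that, taken literally, the expression above (cpu_ticks / io_ticks * 100) would be the reciprocal of this, so presumably the ratio is meant the other way around:

```shell
#!/bin/bash
# Illustrative arithmetic for the %util metric discussed above: the
# fraction of a sampling window during which the device was busy.
# Both arguments are in milliseconds.
util() {
    local io_ticks_ms=$1 elapsed_ms=$2
    echo $(( io_ticks_ms * 100 / elapsed_ms ))
}

u=$(util 920 1000)    # device busy for 920ms of a 1000ms window
echo "util=${u}%"
if [ "$u" -ge 90 ]; then
    echo "above blockio.watermark=90, throttling rules apply"
fi
```

In the kernel the io_ticks side would come from part_stat_read(bdev->bd_part, io_ticks), as quoted above; the shell helper here is only a userspace analogue of that check.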
> > Doing a very quick test with io-throttle, using a 10MB/s BW limit and > > blockio.watermark=90: > > > > Launching reader > > 256+0 records in > > 256+0 records out > > 268435456 bytes (268 MB) copied, 32.2798 s, 8.3 MB/s > > > > In the same time the writers wrote ~190MB, so the single reader got > > about 1/3 of the total BW. > > > > 182M testzerofile4 > > 198M testzerofile1 > > 188M testzerofile3 > > 189M testzerofile2 > > But it's not really a max bw controller at all now, is it? I seem to be getting a > total BW of (268+182+198+188+189)/32 = 32MB/s and you set the limit to > 10MB/s? > The limit of 10MB/s is applied only when the consumed disk BW hits 90%. If the disk is not fully saturated no limit is applied. It's nothing more than soft limiting, to avoid wasting the unused disk BW that we have with hard limits. This is similar to the proportional approach from a certain point of view. But ok, this only reduces the number of times that we block the IO requests. The fact is that when we apply throttling, the probability of blocking a read or a write is the same also in this case. > > [..] > > What are the results with your IO scheduler controller (if you already > > have them, otherwise I'll repeat this test in my system)? It seems a > > very interesting test to compare the advantages of the IO scheduler > > solution with respect to the io-throttle approach. > > > > I had not done any reader-writer testing so far. But you forced me to run > some now. :-) Here are the results. Good! :) > > Because one is a max BW controller and the other a proportional BW controller, > doing an exact comparison is hard. Still.... > > Test1 > ===== > Try to run lots of writers (50 random writers using fio and 4 sequential > writers with dd if=/dev/zero) and one single reader either in the root group > or with-in one cgroup to show that readers are not starved by writers, > as opposed to the io-throttle controller.
> > Run test1 with vanilla kernel with CFQ > ===================================== > Launched 50 fio random writers, 4 sequential writers and 1 reader in root > and noted how long it takes reader to finish. Also noted the per second output > from iostat -d 1 -m /dev/sdb1 to monitor how disk throughput varies. > > *********************************************************************** > # launch 50 writers fio job > > fio_args="--size=64m --rw=write --numjobs=50 --group_reporting" > fio $fio_args --name=test2 --directory=/mnt/sdb/fio2/ --output=/mnt/sdb/fio2/test2.log > /dev/null & > > #launch 4 sequential writers > ionice -c 2 -n 7 dd if=/dev/zero of=/mnt/sdb/testzerofile1 bs=4K count=524288 & > ionice -c 2 -n 7 dd if=/dev/zero of=/mnt/sdb/testzerofile2 bs=4K count=524288 & > ionice -c 2 -n 7 dd if=/dev/zero of=/mnt/sdb/testzerofile3 bs=4K count=524288 & > ionice -c 2 -n 7 dd if=/dev/zero of=/mnt/sdb/testzerofile4 bs=4K count=524288 & > > echo "Sleeping for 5 seconds" > sleep 5 > echo "Launching reader" > > ionice -c 2 -n 0 dd if=/mnt/sdb/zerofile2 of=/dev/zero & > wait $! > echo "Reader Finished" > *************************************************************************** > > Results > ------- > 234179072 bytes (234 MB) copied, 4.55047 s, 51.5 MB/s > > Reader finished in 4.5 seconds. 
Following are a few lines from the iostat output > > *********************************************************************** > Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn > sdb1 151.00 0.04 48.33 0 48 > > Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn > sdb1 120.00 1.78 31.23 1 31 > > Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn > sdb1 504.95 56.75 7.51 57 7 > > Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn > sdb1 547.47 62.71 4.47 62 4 > > Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn > sdb1 441.00 49.80 7.82 49 7 > > Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn > sdb1 441.41 48.28 13.84 47 13 > > ************************************************************************* > > Note how write picks up first and then suddenly the reader comes in and CFQ > allocates a huge chunk of BW to the reader to give it the advantage. > > Run Test1 with IO scheduler based io controller patch > ===================================================== > > 234179072 bytes (234 MB) copied, 5.23141 s, 44.8 MB/s > > Reader finishes in 5.23 seconds. Why does it take more time than CFQ? > Because it looks like the current algorithm is not punishing writers as > hard. This can be fixed and is not an issue. > > Following is some output from iostat.
> > ********************************************************************** > Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn > sdb1 139.60 0.04 43.83 0 44 > > Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn > sdb1 227.72 16.88 29.05 17 29 > > Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn > sdb1 349.00 35.04 16.06 35 16 > > Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn > sdb1 339.00 34.16 21.07 34 21 > > Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn > sdb1 343.56 36.68 12.54 37 12 > > Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn > sdb1 378.00 38.68 19.47 38 19 > > Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn > sdb1 532.00 59.06 10.00 59 10 > > Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn > sdb1 125.00 2.62 38.82 2 38 > ************************************************************************ > > Note how read throughput goes up when the reader comes in. Also note that > the writer is still getting some decent IO done, and that's why the reader took > a little bit more time as compared to CFQ. > > > Run Test1 with IO throttle patches > ================================== > > Now the same test is run with the io-throttle patches. The only difference is that > I ran the test in a cgroup with a max limit of 32MB/s. That should mean > that effectively we got a disk which can support at max a 32MB/s IO rate. > If we look at the above CFQ and io controller results, it looks like with > the above load we touched a peak of 70MB/s. So one can think of the same test > being run on a disk roughly half the speed of the original disk. > > 234179072 bytes (234 MB) copied, 144.207 s, 1.6 MB/s > > Reader got a disk rate of 1.6MB/s (5 %) out of the 32MB/s capacity, as opposed to > the CFQ and io scheduler controller cases, where the reader got around 70-80% of > the disk BW under a similar workload. > > Test2 > ===== > Run test2 with io scheduler based io controller > =============================================== > Now run almost the same test with a little difference. This time I create two > cgroups of the same weight, 1000.
I run the 50 fio random writers in one cgroup > and the 4 sequential writers and 1 reader in the second group. This test is more > to show that the proportional BW IO controller is working and that, because of the > reader in group2, group1 writes are not killed (providing isolation) and, > secondly, the reader still gets preference over the writers which are in the same > group. > > root > / \ > group1 group2 > (50 fio writers) ( 4 writers and one reader) > > 234179072 bytes (234 MB) copied, 12.8546 s, 18.2 MB/s > > Reader finished in almost 13 seconds and got around 18MB/s. Remember, when > everything was in the root group the reader got around 45MB/s. This is to account > for the fact that half of the disk is now being shared by the other cgroup, > which is running the 50 fio writers, and the reader can't steal the disk from them. > > Following is some portion of the iostat output when the reader became active > ********************************************************************* > Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn > sdb1 103.92 0.03 40.21 0 41 > > Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn > sdb1 240.00 15.78 37.40 15 37 > > Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn > sdb1 206.93 13.17 28.50 13 28 > > Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn > sdb1 224.75 15.39 27.89 15 28 > > Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn > sdb1 270.71 16.85 25.95 16 25 > > Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn > sdb1 215.84 8.81 32.40 8 32 > > Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn > sdb1 216.16 19.11 20.75 18 20 > > Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn > sdb1 211.11 14.67 35.77 14 35 > > Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn > sdb1 208.91 15.04 26.95 15 27 > > Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn > sdb1 277.23 24.30 28.53 24 28 > > Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn > sdb1 202.97 12.29 34.79 12 35 > ********************************************************************** > > Total disk throughput is varying a lot, on an average it looks like
it > is getting 45MB/s. Lets say 50% of that is going to cgroup1 (fio writers), > then out of rest of 22 MB/s reader seems to have to 18MB/s. These are > highly approximate numbers. I think I need to come up with some kind of > tool to measure per cgroup throughput (like we have for per partition > stat) for more accurate comparision. > > But the point is that second cgroup got the isolation and read got > preference with-in same cgroup. The expected behavior. > > Run test2 with io-throttle > ========================== > Same setup of two groups. The only difference is that I setup two groups > with (16MB) limit. So previous 32MB limit got divided between two cgroups > 50% each. > > - 234179072 bytes (234 MB) copied, 90.8055 s, 2.6 MB/s > > Reader took 90 seconds to finish. It seems to have got around 16% of > available disk BW (16MB) to it. > > iostat output is long. Will just paste one section. > > ************************************************************************ > [..] > > Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn > sdb1 141.58 10.16 16.12 10 16 > > Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn > sdb1 174.75 8.06 12.31 7 12 > > Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn > sdb1 47.52 0.12 6.16 0 6 > > Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn > sdb1 82.00 0.00 31.85 0 31 > > Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn > sdb1 141.00 0.00 48.07 0 48 > > Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn > sdb1 72.73 0.00 26.52 0 26 > > > *************************************************************************** > > Conclusion > ========== > It just reaffirms that with max BW control, we are not doing a fair job > of throttling hence no more hold the IO scheduler properties with-in > cgroup. > > With proportional BW controller implemented at IO scheduler level, one > can do very tight integration with IO controller and hence retain > IO scheduler behavior with-in cgroup. It is worth to bug you I would say :). 
Results are interesting, definitely. I'll check if it's possible to merge part of the io-throttle max BW control into this controller and who knows, maybe we'll finally be able to converge on a common proposal... Thanks, -Andrea ^ permalink raw reply [flat|nested] 97+ messages in thread
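[Editorial note: the proportional split in Test2 above (two cgroups of equal weight 1000 sharing roughly 45 MB/s) can be sanity-checked with simple weight arithmetic. This is an illustrative sketch, not code from either patchset; the function name is mine.]

```python
def expected_share(total_bw, weights, gid):
    """Proportional BW control: a group's expected share of the total
    throughput is its weight divided by the sum of all active weights."""
    return total_bw * weights[gid] / sum(weights.values())

# Test2 setup: both cgroups have weight 1000, ~45 MB/s total observed,
# so group2 (4 writers + reader) should get about 22.5 MB/s, which is
# consistent with the reader alone measuring ~18 MB/s inside it.
```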
* Re: IO scheduler based IO Controller V2 2009-05-08 20:05 ` Andrea Righi @ 2009-05-08 21:56 ` Vivek Goyal 0 siblings, 0 replies; 97+ messages in thread From: Vivek Goyal @ 2009-05-08 21:56 UTC (permalink / raw) To: Andrea Righi Cc: Andrew Morton, nauman, dpshah, lizf, mikew, fchecconi, paolo.valente, jens.axboe, ryov, fernando, s-uchida, taka, guijianfeng, jmoyer, dhaval, balbir, linux-kernel, containers, agk, dm-devel, snitzer, m-ikeda, peterz On Fri, May 08, 2009 at 10:05:01PM +0200, Andrea Righi wrote: [..] > > Conclusion > > ========== > > It just reaffirms that with max BW control, we are not doing a fair job > > of throttling, hence we no longer retain the IO scheduler properties with-in > > the cgroup. > > > > With a proportional BW controller implemented at the IO scheduler level, one > > can do very tight integration with the IO controller and hence retain > > IO scheduler behavior with-in the cgroup. > > It is worth bugging you, I would say :). Results are interesting, > definitely. I'll check if it's possible to merge part of the io-throttle > max BW control into this controller and who knows, maybe we'll finally be able > to converge on a common proposal... Great. A few thoughts though. - What are your requirements? Do you strictly need max bw control, or will proportional BW control satisfy your needs? Or do you need both? - With the current algorithm, BFQ (modified WF2Q+), we should be able to do proportional BW division while maintaining the properties of the IO scheduler with-in a cgroup in a hierarchical manner. I think it can be simply enhanced to do max bw control also.
That is, whenever a queue is selected for dispatch (from a fairness point of view), also check the IO rate of that group and, if the IO rate has been exceeded, expire the queue immediately and fake as if the queue consumed its time slice, which will be equivalent to throttling. But in this simple scheme, I think throttling is still unfair with-in the class. What I mean is the following. If an RT task and a BE task are in the same cgroup and the cgroup exceeds its max BW, the RT task is next to be dispatched from a fairness point of view and it will end up being throttled. This is still fine, because until the RT task is finished, the BE task will never get to run in that cgroup, so at some point of time the cgroup rate will come down and the RT task will get the IO done, meeting fairness and max bw constraints. But this simple scheme does not work with-in the same class. Say prio 0 and prio 7 BE class readers. Now we will end up throttling the guy who is scheduled to go next, and there is no mechanism to ensure that prio0 and prio7 tasks are throttled in a proportionate manner. So, we shall have to come up with something better. I think Dhaval was implementing an upper limit for the cpu controller. Maybe PeterZ and Dhaval can give us some pointers on how they managed to implement both proportional and max bw control with the help of a single tree while maintaining the notion of prio with-in a cgroup. PeterZ/Dhaval ^^^^^^^^ - We should be able to get rid of the reader-writer issue even with the above simple throttling mechanism for schedulers like deadline and AS, because at the elevator we see it as a single queue (for both reads and writes) and we will throttle this queue. With-in queue dispatches are taken care of by the io scheduler. So as long as IO has been queued in the queue, the scheduler will take care of giving the advantage to readers even if throttling is taking place on the queue. Why am I thinking out loud? So that we know what we are trying to achieve at the end of the day.
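[Editorial note: the dispatch-time check described above could be sketched roughly as follows. This is illustrative only: the class name, the windowed byte accounting, and the return-value convention are assumptions, not the patchset's code. The idea is that when the group's measured rate exceeds its max BW, the caller expires the queue as if its time slice were consumed.]

```python
import time

class GroupRateLimiter:
    """Track a cgroup's dispatched bytes over a fixed accounting window;
    report when its max BW is exceeded so the queue can be expired."""

    def __init__(self, max_bytes_per_sec, window=1.0):
        self.max_rate = max_bytes_per_sec
        self.window = window            # accounting window, seconds
        self.dispatched = 0             # bytes dispatched this window
        self.window_start = time.monotonic()

    def account_and_check(self, nbytes, now=None):
        """Account nbytes of dispatch; return True if the group's rate
        over the current window now exceeds its limit (expire queue)."""
        if now is None:
            now = time.monotonic()
        if now - self.window_start >= self.window:
            # Start a fresh accounting window.
            self.dispatched = 0
            self.window_start = now
        self.dispatched += nbytes
        return (self.dispatched / self.window) > self.max_rate
```

As the surrounding discussion notes, a per-group check like this throttles whichever queue is next from the fairness point of view, so it does not by itself distribute the throttling proportionally among prio levels within the group.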
So at this point of time, what are the advantages/disadvantages of doing max bw control along with proportional bw control? Advantages ========== - With a combined code base, total code should be less as compared to both of them being implemented separately. - There can be a few advantages in terms of maintaining the notion of the IO scheduler with-in a cgroup (like RT tasks always going first in the presence of BE and IDLE tasks etc. But a simple throttling scheme will not take care of fair throttling with-in a class. We need a better algorithm to achieve that goal). - We probably will get rid of the reader-writer issue for single queue schedulers like deadline and AS. (Need to run tests and see). Disadvantages ============= - Implementation at the IO scheduler/elevator layer does not cover higher level logical devices. So one can do max bw control only at leaf nodes where the IO scheduler is running, and not at intermediate logical nodes. I personally think that proportional BW control will meet more people's needs as compared to max bw control. So far nobody has come up with a solution where a single proposal covers all the cases without breaking things. So personally, I want to make things work at least at the IO scheduler level and cover as much ground as possible without breaking things (hardware RAID, all the direct attached devices etc) and then worry about higher level software devices. Thoughts? Thanks Vivek ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: IO scheduler based IO Controller V2 2009-05-08 21:56 ` Vivek Goyal (?) @ 2009-05-09 9:22 ` Peter Zijlstra -1 siblings, 0 replies; 97+ messages in thread From: Peter Zijlstra @ 2009-05-09 9:22 UTC (permalink / raw) To: Vivek Goyal Cc: Andrea Righi, Andrew Morton, nauman, dpshah, lizf, mikew, fchecconi, paolo.valente, jens.axboe, ryov, fernando, s-uchida, taka, guijianfeng, jmoyer, dhaval, balbir, linux-kernel, containers, agk, dm-devel, snitzer, m-ikeda On Fri, 2009-05-08 at 17:56 -0400, Vivek Goyal wrote: > So, we shall have to come up with something better. I think Dhaval was > implementing an upper limit for the cpu controller. Maybe PeterZ and Dhaval can > give us some pointers on how they managed to implement both proportional > and max bw control with the help of a single tree while maintaining the > notion of prio with-in a cgroup. We don't do max bandwidth control in the SCHED_OTHER bits, as I oppose making it non-work-conserving. SCHED_FIFO/RR do constant bandwidth things and are always scheduled in favour of SCHED_OTHER. That is, we provide a minimum bandwidth for real-time tasks, but since a maximum higher than the minimum is useless, as one cannot rely on it (non-deterministic), we put max = min. ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: IO scheduler based IO Controller V2 2009-05-08 21:56 ` Vivek Goyal (?) (?) @ 2009-05-14 10:31 ` Andrea Righi -1 siblings, 0 replies; 97+ messages in thread From: Andrea Righi @ 2009-05-14 10:31 UTC (permalink / raw) To: Vivek Goyal Cc: Andrew Morton, nauman, dpshah, lizf, mikew, fchecconi, paolo.valente, jens.axboe, ryov, fernando, s-uchida, taka, guijianfeng, jmoyer, dhaval, balbir, linux-kernel, containers, agk, dm-devel, snitzer, m-ikeda, peterz On Fri, May 08, 2009 at 05:56:18PM -0400, Vivek Goyal wrote: > On Fri, May 08, 2009 at 10:05:01PM +0200, Andrea Righi wrote: > > [..] > > > Conclusion > > > ========== > > > It just reaffirms that with max BW control, we are not doing a fair job > > > of throttling, hence we no longer retain the IO scheduler properties with-in > > > the cgroup. > > > > > > With a proportional BW controller implemented at the IO scheduler level, one > > > can do very tight integration with the IO controller and hence retain > > > IO scheduler behavior with-in the cgroup. > > > > It is worth bugging you, I would say :). Results are interesting, > > definitely. I'll check if it's possible to merge part of the io-throttle > > max BW control into this controller and who knows, maybe we'll finally be able > > to converge on a common proposal... > > Great. A few thoughts though. > > - What are your requirements? Do you strictly need max bw control, or > will proportional BW control satisfy your needs? Or do you need both? The theoretical advantages of max BW control are that it offers immediate policy enforcement, mitigating the problem before it happens (a kind of static partitioning, I would say), and that it probably provides more explicit control to contain different classes of users in a hosted environment (e.g., granting BW as a function of how much they pay). And I can say the io-throttle approach at the moment seems to work fine for a production environment (http://www.bluehost.com).
Apart from the motivations above, I don't have specific requirements to provide the max BW control. But it is also true that the io-controller approach is still in a development stage and needs more testing. The design concepts make sense, definitely, so maybe the proportional approach alone will be sufficient to satisfy the requirements of 90% of users out there. -Andrea > > - With the current algorithm, BFQ (modified WF2Q+), we should be able > to do proportional BW division while maintaining the properties of the > IO scheduler with-in a cgroup in a hierarchical manner. > > I think it can be simply enhanced to do max bw control also. That is, > whenever a queue is selected for dispatch (from a fairness point of view), > also check the IO rate of that group and, if the IO rate has been exceeded, expire > the queue immediately and fake as if the queue consumed its time slice, > which will be equivalent to throttling. > > But in this simple scheme, I think throttling is still unfair with-in > the class. What I mean is the following. > > If an RT task and a BE task are in the same cgroup and the cgroup exceeds its > max BW, the RT task is next to be dispatched from a fairness point of view and it > will end up being throttled. This is still fine, because until the RT task is > finished, the BE task will never get to run in that cgroup, so at some point > of time the cgroup rate will come down and the RT task will get the IO done, > meeting fairness and max bw constraints. > > But this simple scheme does not work with-in the same class. Say prio 0 > and prio 7 BE class readers. Now we will end up throttling the guy who > is scheduled to go next, and there is no mechanism to ensure that prio0 and prio7 > tasks are throttled in a proportionate manner. > > So, we shall have to come up with something better. I think Dhaval was > implementing an upper limit for the cpu controller.
Maybe PeterZ and Dhaval can > give us some pointers on how they managed to implement both proportional > and max bw control with the help of a single tree while maintaining the > notion of prio with-in a cgroup. > > PeterZ/Dhaval ^^^^^^^^ > > - We should be able to get rid of the reader-writer issue even with the above > simple throttling mechanism for schedulers like deadline and AS, because at the > elevator we see it as a single queue (for both reads and writes) and we > will throttle this queue. With-in queue dispatches are taken care of by the io > scheduler. So as long as IO has been queued in the queue, the scheduler > will take care of giving the advantage to readers even if throttling is > taking place on the queue. > > Why am I thinking out loud? So that we know what we are trying to achieve at the > end of the day. So at this point of time, what are the advantages/disadvantages > of doing max bw control along with proportional bw control? > > Advantages > ========== > - With a combined code base, total code should be less as compared to > both of them being implemented separately. > > - There can be a few advantages in terms of maintaining the notion of the IO > scheduler with-in a cgroup (like RT tasks always going first in the presence > of BE and IDLE tasks etc. But a simple throttling scheme will not take > care of fair throttling with-in a class. We need a better algorithm to > achieve that goal). > > - We probably will get rid of the reader-writer issue for single queue > schedulers like deadline and AS. (Need to run tests and see). > > Disadvantages > ============= > - Implementation at the IO scheduler/elevator layer does not cover higher > level logical devices.
> > So far nobody has come up with a solution where a single proposal covers > all the cases without breaking things. So personally, I want to make > things work at least at IO scheduler level and cover as much ground as > possible without breaking things (hardware RAID, all the direct attached > devices etc) and then worry about higher level software devices. > > Thoughts? > > Thanks > Vivek ^ permalink raw reply [flat|nested] 97+ messages in thread
[parent not found: <20090508215618.GJ7293-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>]
* Re: IO scheduler based IO Controller V2 [not found] ` <20090508215618.GJ7293-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2009-05-09 9:22 ` Peter Zijlstra 2009-05-14 10:31 ` Andrea Righi 2009-05-14 16:43 ` Dhaval Giani 2 siblings, 0 replies; 97+ messages in thread From: Peter Zijlstra @ 2009-05-09 9:22 UTC (permalink / raw) To: Vivek Goyal Cc: dhaval-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, snitzer-H+wXaHxf7aLQT0dZR+AlfA, dm-devel-H+wXaHxf7aLQT0dZR+AlfA, jens.axboe-QHcLZuEGTsvQT0dZR+AlfA, agk-H+wXaHxf7aLQT0dZR+AlfA, balbir-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, paolo.valente-rcYM44yAMweonA0d6jMUrA, fernando-gVGce1chcLdL9jVzuh4AOg, jmoyer-H+wXaHxf7aLQT0dZR+AlfA, Andrea Righi, fchecconi-Re5JQEeQqe8AvxtiuMwx3w, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Andrew Morton On Fri, 2009-05-08 at 17:56 -0400, Vivek Goyal wrote: > So, we shall have to come up with something better, I think Dhaval was > implementing upper limit for cpu controller. May be PeterZ and Dhaval can > give us some pointers how did they manage to implement both proportional > and max bw control with the help of a single tree while maintaining the > notion of prio with-in cgroup. We don't do max bandwidth control in the SCHED_OTHER bits as I oppose to making it non work conserving. SCHED_FIFO/RR do constant bandwidth things and are always scheduled in favour of SCHED_OTHER. That is, we provide a minimum bandwidth for real-time tasks, but since having a maximum higher than the minimum is useless since one cannot rely on it (non deterministic) we put max = min. ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: IO scheduler based IO Controller V2 [not found] ` <20090508215618.GJ7293-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2009-05-09 9:22 ` Peter Zijlstra @ 2009-05-14 10:31 ` Andrea Righi 2009-05-14 16:43 ` Dhaval Giani 2 siblings, 0 replies; 97+ messages in thread From: Andrea Righi @ 2009-05-14 10:31 UTC (permalink / raw) To: Vivek Goyal Cc: dhaval-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, snitzer-H+wXaHxf7aLQT0dZR+AlfA, dm-devel-H+wXaHxf7aLQT0dZR+AlfA, jens.axboe-QHcLZuEGTsvQT0dZR+AlfA, agk-H+wXaHxf7aLQT0dZR+AlfA, balbir-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, paolo.valente-rcYM44yAMweonA0d6jMUrA, fernando-gVGce1chcLdL9jVzuh4AOg, jmoyer-H+wXaHxf7aLQT0dZR+AlfA, fchecconi-Re5JQEeQqe8AvxtiuMwx3w, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Andrew Morton On Fri, May 08, 2009 at 05:56:18PM -0400, Vivek Goyal wrote: > On Fri, May 08, 2009 at 10:05:01PM +0200, Andrea Righi wrote: > > [..] > > > Conclusion > > > ========== > > > It just reaffirms that with max BW control, we are not doing a fair job > > > of throttling hence no more hold the IO scheduler properties with-in > > > cgroup. > > > > > > With proportional BW controller implemented at IO scheduler level, one > > > can do very tight integration with IO controller and hence retain > > > IO scheduler behavior with-in cgroup. > > > > It is worth to bug you I would say :). Results are interesting, > > definitely. I'll check if it's possible to merge part of the io-throttle > > max BW control in this controller and who knows if finally we'll be able > > to converge to a common proposal... > > Great, Few thoughts though. > > - What are your requirements? Do you strictly need max bw control or > proportional BW control will satisfy your needs? Or you need both? 
The theoretical advantages of max BW control are that they offer an immediate action on policy enforcement mitigating the problem before it happens (a kind of static partitioning I would say) and that you have probably something that provides a more explicit control to contain different classes of users in hosted environment (e.g., give BW in function on how much they pay). And I can say the io-throttle approach at the moment seems to work fine for a production environment (http://www.bluehost.com). Apart the motivations above, I don't have specific requirements to provide the max BW control. But it is also true that the io-controller approach is still in a development stage and needs more testing. The design concepts make sense, definitely, so maybe only the proportional approach will be sufficient to satisfy the requirements of the 90% of users out there. -Andrea > > - With the current algorithm BFQ (modified WF2Q+), we should be able > to do proportional BW division while maintaining the properties of > IO scheduler with-in cgroup in hiearchical manner. > > I think it can be simply enhanced to do max bw control also. That is > whenever a queue is selected for dispatch (from fairness point of view) > also check the IO rate of that group and if IO rate exceeded, expire > the queue immediately and fake as if queue consumed its time slice > which will be equivalent to throttling. > > But in this simple scheme, I think throttling is still unfair with-in > the class. What I mean is following. > > if an RT task and an BE task are in same cgroup and cgroup exceeds its > max BW, RT task is next to be dispatched from fairness point of view and it > will end being throttled. This is still fine because until RT task is > finished, BE task will never get to run in that cgroup, so at some point > of time, cgroup rate will come down and RT task will get the IO done > meeting fairnesss and max bw constraints. > > But this simple scheme does not work with-in same class. 
> Say prio 0
> and prio 7 BE class readers. Now we will end up throttling the guy who
> is scheduled to go next, and there is no mechanism to ensure that prio0 and prio7
> tasks are throttled in a proportionate manner.
>
> So, we shall have to come up with something better. I think Dhaval was
> implementing an upper limit for the cpu controller. Maybe PeterZ and Dhaval can
> give us some pointers on how they managed to implement both proportional
> and max bw control with the help of a single tree while maintaining the
> notion of prio with-in cgroup.
>
> PeterZ/Dhaval ^^^^^^^^
>
> - We should be able to get rid of the reader-writer issue even with the above
> simple throttling mechanism for schedulers like deadline and AS, because at
> the elevator we see it as a single queue (for both reads and writes) and we
> will throttle this queue. With-in queue, dispatches are taken care of by the io
> scheduler. So as long as IO has been queued in the queue, the scheduler
> will take care of giving the advantage to readers even if throttling is
> taking place on the queue.
>
> Why am I thinking aloud? So that we know what we are trying to achieve at the
> end of the day. So at this point of time, what are the advantages/disadvantages
> of doing max bw control along with proportional bw control?
>
> Advantages
> ==========
> - With a combined code base, total code should be less as compared to if
> both of them are implemented separately.
>
> - There can be a few advantages in terms of maintaining the notion of the IO
> scheduler with-in cgroup (like RT tasks always go first in the presence
> of BE and IDLE tasks etc. But a simple throttling scheme will not take
> care of fair throttling with-in the class. We need a better algorithm to
> achieve that goal).
>
> - We probably will get rid of the reader-writer issue for single-queue
> schedulers like deadline and AS. (Need to run tests and see.)
>
> Disadvantages
> =============
> - Implementation at the IO scheduler/elevator layer does not cover higher
> level logical devices.
> So one can do max bw control only at leaf nodes
> where the IO scheduler is running, and not at intermediate logical nodes.
>
> I personally think that proportional BW control will meet more people's
> needs as compared to max bw control.
>
> So far nobody has come up with a solution where a single proposal covers
> all the cases without breaking things. So personally, I want to make
> things work at least at the IO scheduler level and cover as much ground as
> possible without breaking things (hardware RAID, all the direct attached
> devices etc.) and then worry about higher level software devices.
>
> Thoughts?
>
> Thanks
> Vivek

^ permalink raw reply	[flat|nested] 97+ messages in thread
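The dispatch-time throttling idea discussed in the message above (expire a queue, faking a used time slice, whenever its group exceeds its max IO rate) can be illustrated with a small user-space simulation. All names here are hypothetical; this is a toy model of the mechanism, not the actual elevator code:

```python
class GroupRate:
    """Per-group byte accounting with a max-rate check at dispatch time."""

    def __init__(self, max_bps, start=0.0):
        self.max_bps = max_bps        # allowed bytes per second for this group
        self.dispatched = 0           # bytes dispatched since `start`
        self.window_start = start

    def account(self, nbytes):
        """Charge `nbytes` of dispatched IO to this group."""
        self.dispatched += nbytes

    def over_limit(self, now):
        """True if the average rate since window_start exceeds max_bps."""
        elapsed = max(now - self.window_start, 1e-9)
        return self.dispatched / elapsed > self.max_bps


def dispatch_decision(group, next_rq_bytes, now):
    """Fairness has picked this group's queue; apply the max-bw check on top.

    Returning "expire" models 'fake as if the queue consumed its time slice',
    i.e. the group is throttled instead of being served.
    """
    if group.over_limit(now):
        return "expire"
    group.account(next_rq_bytes)
    return "serve"
```

As the message notes, this per-group check alone says nothing about which task inside the group absorbs the throttling, which is exactly the intra-class fairness problem raised above.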
* Re: IO scheduler based IO Controller V2 2009-05-08 21:56 ` Vivek Goyal @ 2009-05-14 16:43 ` Dhaval Giani -1 siblings, 0 replies; 97+ messages in thread
From: Dhaval Giani @ 2009-05-14 16:43 UTC (permalink / raw)
To: Vivek Goyal
Cc: snitzer-H+wXaHxf7aLQT0dZR+AlfA, dm-devel-H+wXaHxf7aLQT0dZR+AlfA, jens.axboe-QHcLZuEGTsvQT0dZR+AlfA, agk-H+wXaHxf7aLQT0dZR+AlfA, balbir-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, paolo.valente-rcYM44yAMweonA0d6jMUrA, fernando-gVGce1chcLdL9jVzuh4AOg, jmoyer-H+wXaHxf7aLQT0dZR+AlfA, Bharata B Rao, fchecconi-Re5JQEeQqe8AvxtiuMwx3w, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Andrew Morton, Andrea Righi

On Fri, May 08, 2009 at 05:56:18PM -0400, Vivek Goyal wrote:
> So, we shall have to come up with something better. I think Dhaval was
> implementing an upper limit for the cpu controller. Maybe PeterZ and Dhaval can
> give us some pointers on how they managed to implement both proportional
> and max bw control with the help of a single tree while maintaining the
> notion of prio with-in cgroup.
>
> PeterZ/Dhaval ^^^^^^^^
>

We still haven't :). I think the idea is to keep fairness (or proportion) between the groups that are currently running. The throttled groups should not be considered.

thanks,
--
regards,
Dhaval

^ permalink raw reply	[flat|nested] 97+ messages in thread
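Dhaval's suggestion, computing fairness only over the groups that are currently runnable and letting throttled groups simply drop out of the competition, can be sketched roughly like this (a hypothetical toy model, not the actual CFS or BFQ service-tree code):

```python
def pick_next_group(groups, now):
    """Pick the backlogged, non-throttled group with the smallest virtual
    finish time. Throttled groups do not compete for the slot, so the
    proportional shares are divided only among the runnable groups."""
    runnable = [g for g in groups
                if g["backlogged"] and g["throttled_until"] <= now]
    if not runnable:
        return None
    return min(runnable, key=lambda g: g["vfinish"])
```

Once a group's throttle window expires it re-enters the competition and, having the smallest virtual finish time, is served next, so throttling does not cost it its relative share.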
* Re: IO scheduler based IO Controller V2 [not found] ` <20090508180951.GG7293-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2009-05-08 20:05 ` Andrea Righi 0 siblings, 0 replies; 97+ messages in thread From: Andrea Righi @ 2009-05-08 20:05 UTC (permalink / raw) To: Vivek Goyal Cc: dhaval-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, snitzer-H+wXaHxf7aLQT0dZR+AlfA, dm-devel-H+wXaHxf7aLQT0dZR+AlfA, jens.axboe-QHcLZuEGTsvQT0dZR+AlfA, agk-H+wXaHxf7aLQT0dZR+AlfA, balbir-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, paolo.valente-rcYM44yAMweonA0d6jMUrA, fernando-gVGce1chcLdL9jVzuh4AOg, jmoyer-H+wXaHxf7aLQT0dZR+AlfA, fchecconi-Re5JQEeQqe8AvxtiuMwx3w, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Andrew Morton On Fri, May 08, 2009 at 02:09:51PM -0400, Vivek Goyal wrote: > On Fri, May 08, 2009 at 12:19:01AM +0200, Andrea Righi wrote: > > On Thu, May 07, 2009 at 11:36:42AM -0400, Vivek Goyal wrote: > > > Hmm.., my old config had "AS" as default scheduler that's why I was seeing > > > the strange issue of RT task finishing after BE. My apologies for that. I > > > somehow assumed that CFQ is default scheduler in my config. > > > > ok. > > > > > > > > So I have re-run the test to see if we are still seeing the issue of > > > loosing priority and class with-in cgroup. And we still do.. > > > > > > 2.6.30-rc4 with io-throttle patches > > > =================================== > > > Test1 > > > ===== > > > - Two readers, one BE prio 0 and other BE prio 7 in a cgroup limited with > > > 8MB/s BW. > > > > > > 234179072 bytes (234 MB) copied, 55.8448 s, 4.2 MB/s > > > prio 0 task finished > > > 234179072 bytes (234 MB) copied, 55.8878 s, 4.2 MB/s > > > > > > Test2 > > > ===== > > > - Two readers, one RT prio 0 and other BE prio 7 in a cgroup limited with > > > 8MB/s BW. 
> > > > > > 234179072 bytes (234 MB) copied, 55.8876 s, 4.2 MB/s > > > 234179072 bytes (234 MB) copied, 55.8984 s, 4.2 MB/s > > > RT task finished > > > > ok, coherent with the current io-throttle implementation. > > > > > > > > Test3 > > > ===== > > > - Reader Starvation > > > - I created a cgroup with BW limit of 64MB/s. First I just run the reader > > > alone and then I run reader along with 4 writers 4 times. > > > > > > Reader alone > > > 234179072 bytes (234 MB) copied, 3.71796 s, 63.0 MB/s > > > > > > Reader with 4 writers > > > --------------------- > > > First run > > > 234179072 bytes (234 MB) copied, 30.394 s, 7.7 MB/s > > > > > > Second run > > > 234179072 bytes (234 MB) copied, 26.9607 s, 8.7 MB/s > > > > > > Third run > > > 234179072 bytes (234 MB) copied, 37.3515 s, 6.3 MB/s > > > > > > Fourth run > > > 234179072 bytes (234 MB) copied, 36.817 s, 6.4 MB/s > > > > > > Note that out of 64MB/s limit of this cgroup, reader does not get even > > > 1/5 of the BW. In normal systems, readers are advantaged and reader gets > > > its job done much faster even in presence of multiple writers. > > > > And this is also coherent. The throttling is equally probable for read > > and write. But this shouldn't happen if we saturate the physical disk BW > > (doing proportional BW control or using a watermark close to 100 in > > io-throttle). In this case IO scheduler logic shouldn't be totally > > broken. > > > > Can you please explain the watermark a bit more? So blockio.watermark=90 > mean 90% of what? total disk BW? But disk BW varies based on work load? The controller starts to apply throttling rules only when the total disk BW utilization is greater than 90%. 
The consumed BW is evaluated as (cpu_ticks / io_ticks * 100), where cpu_ticks are the ticks (in jiffies) since the last i/o request and io_ticks is the difference of the ticks accounted to a particular block device, retrieved by:

part_stat_read(bdev->bd_part, io_ticks)

BTW it's the same metric (%util) used by iostat.

>
> > Doing a very quick test with io-throttle, using a 10MB/s BW limit and
> > blockio.watermark=90:
> >
> > Launching reader
> > 256+0 records in
> > 256+0 records out
> > 268435456 bytes (268 MB) copied, 32.2798 s, 8.3 MB/s
> >
> > In the same time the writers wrote ~190MB, so the single reader got
> > about 1/3 of the total BW.
> >
> > 182M testzerofile4
> > 198M testzerofile1
> > 188M testzerofile3
> > 189M testzerofile2
> >
>
> But it's not a max bw controller at all now? I seem to be getting the
> total BW of (268+182+198+188+189)/32 = 32MB/s and you set the limit to
> 10MB/s?
>

The limit of 10MB/s is applied only when the consumed disk BW hits 90%. If the disk is not fully saturated, no limit is applied. It's nothing more than soft limiting, to avoid wasting the unused disk BW that we have with hard limits. This is similar to the proportional approach from a certain point of view. But ok, this only reduces the number of times that we block the IO requests. The fact is that when we apply throttling, the probability of blocking a read or a write is the same also in this case.

> > [..]
> > What are the results with your IO scheduler controller (if you already
> > have them, otherwise I'll repeat this test in my system)? It seems a
> > very interesting test to compare the advantages of the IO scheduler
> > solution with respect to the io-throttle approach.
> >
> I had not done any reader-writer testing so far. But you forced me to run
> some now. :-) Here are the results.

Good! :)

> Because one is a max BW controller and the other is a proportional BW controller,
> doing an exact comparison is hard. Still....
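For reference, the %util metric mentioned above is derived from the kernel's io_ticks counter, which user space sees as the 13th whitespace-separated column of /proc/diskstats (milliseconds during which the device had I/O requests in flight). A rough user-space sketch of the same computation (the sample text and device names in the usage are illustrative):

```python
def read_io_ticks(device, diskstats_text):
    """Return io_ticks for `device` from the contents of /proc/diskstats.

    Columns are: major, minor, device name, then the stat fields;
    io_ticks is the 10th stat field, i.e. fields[12] overall.
    """
    for line in diskstats_text.splitlines():
        fields = line.split()
        if len(fields) >= 13 and fields[2] == device:
            return int(fields[12])
    raise KeyError(device)


def util_percent(ticks_before, ticks_after, elapsed_ms):
    """iostat-style %util: share of the sampling interval the device was busy."""
    return min(100.0, (ticks_after - ticks_before) / elapsed_ms * 100.0)
```

On a live system one would read /proc/diskstats twice, elapsed_ms apart, and feed the two io_ticks samples to util_percent; a watermark check like the one described above would then compare the result against the configured threshold.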
>
> Test1
> =====
> Try to run lots of writers (50 random writers using fio and 4 sequential
> writers with dd if=/dev/zero) and one single reader, either in the root group
> or within one cgroup, to show that readers are not starved by writers,
> as opposed to the io-throttle controller.
>
> Run test1 with vanilla kernel with CFQ
> ======================================
> Launched 50 fio random writers, 4 sequential writers and 1 reader in root
> and noted how long it takes the reader to finish. Also noted the per-second output
> from iostat -d 1 -m /dev/sdb1 to monitor how disk throughput varies.
>
> ***********************************************************************
> # launch 50 writers fio job
>
> fio_args="--size=64m --rw=write --numjobs=50 --group_reporting"
> fio $fio_args --name=test2 --directory=/mnt/sdb/fio2/ --output=/mnt/sdb/fio2/test2.log > /dev/null &
>
> #launch 4 sequential writers
> ionice -c 2 -n 7 dd if=/dev/zero of=/mnt/sdb/testzerofile1 bs=4K count=524288 &
> ionice -c 2 -n 7 dd if=/dev/zero of=/mnt/sdb/testzerofile2 bs=4K count=524288 &
> ionice -c 2 -n 7 dd if=/dev/zero of=/mnt/sdb/testzerofile3 bs=4K count=524288 &
> ionice -c 2 -n 7 dd if=/dev/zero of=/mnt/sdb/testzerofile4 bs=4K count=524288 &
>
> echo "Sleeping for 5 seconds"
> sleep 5
> echo "Launching reader"
>
> ionice -c 2 -n 0 dd if=/mnt/sdb/zerofile2 of=/dev/zero &
> wait $!
> echo "Reader Finished"
> ***************************************************************************
>
> Results
> -------
> 234179072 bytes (234 MB) copied, 4.55047 s, 51.5 MB/s
>
> Reader finished in 4.5 seconds.
> Following are a few lines from the iostat output
>
> ***********************************************************************
> Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
> sdb1            151.00         0.04        48.33          0         48
>
> Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
> sdb1            120.00         1.78        31.23          1         31
>
> Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
> sdb1            504.95        56.75         7.51         57          7
>
> Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
> sdb1            547.47        62.71         4.47         62          4
>
> Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
> sdb1            441.00        49.80         7.82         49          7
>
> Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
> sdb1            441.41        48.28        13.84         47         13
>
> *************************************************************************
>
> Note how first the write picks up, and then suddenly the reader comes in and CFQ
> allocates a huge chunk of BW to the reader to give it the advantage.
>
> Run Test1 with IO scheduler based io controller patch
> =====================================================
>
> 234179072 bytes (234 MB) copied, 5.23141 s, 44.8 MB/s
>
> Reader finishes in 5.23 seconds. Why does it take more time than CFQ?
> Because it looks like the current algorithm is not punishing writers that
> hard. This can be fixed and is not an issue.
>
> Following is some output from iostat.
>
> **********************************************************************
> Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
> sdb1            139.60         0.04        43.83          0         44
>
> Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
> sdb1            227.72        16.88        29.05         17         29
>
> Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
> sdb1            349.00        35.04        16.06         35         16
>
> Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
> sdb1            339.00        34.16        21.07         34         21
>
> Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
> sdb1            343.56        36.68        12.54         37         12
>
> Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
> sdb1            378.00        38.68        19.47         38         19
>
> Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
> sdb1            532.00        59.06        10.00         59         10
>
> Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
> sdb1            125.00         2.62        38.82          2         38
> ************************************************************************
>
> Note how read throughput goes up when the reader comes in. Also note that
> the writer is still getting some decent IO done, and that's why the reader took
> a little bit more time as compared to CFQ.
>
> Run Test1 with IO throttle patches
> ==================================
>
> Now the same test is run with the io-throttle patches. The only difference is that
> I ran the test in a cgroup with a max limit of 32MB/s. That should mean
> that effectively we got a disk which can support at max a 32MB/s IO rate.
> If we look at the above CFQ and io controller results, it looks like with
> the above load we touched a peak of 70MB/s. So one can think of the same test
> being run on a disk roughly half the speed of the original disk.
>
> 234179072 bytes (234 MB) copied, 144.207 s, 1.6 MB/s
>
> Reader got a disk rate of 1.6MB/s (5 %) out of the 32MB/s capacity, as opposed to
> the CFQ and io scheduler controller cases where the reader got around 70-80% of
> disk BW under a similar workload.
>
> Test2
> =====
> Run test2 with io scheduler based io controller
> ===============================================
> Now run almost the same test with a little difference. This time I create two
> cgroups of the same weight 1000.
> I run the 50 fio random writers in one cgroup
> and 4 sequential writers and 1 reader in the second group. This test is more
> to show that the proportional BW IO controller is working, that because of
> the reader in group2, group1 writes are not killed (providing isolation), and
> secondly, that the reader still gets preference over the writers which are in
> the same group.
>
>                    root
>                   /    \
>              group1    group2
>    (50 fio writers)    (4 writers and one reader)
>
> 234179072 bytes (234 MB) copied, 12.8546 s, 18.2 MB/s
>
> Reader finished in almost 13 seconds and got around 18MB/s. Remember, when
> everything was in the root group the reader got around 45MB/s. This is to account
> for the fact that half of the disk is now being shared by the other cgroup,
> which is running 50 fio writers, and the reader can't steal the disk from them.
>
> Following is some portion of the iostat output when the reader became active
> *********************************************************************
> Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
> sdb1            103.92         0.03        40.21          0         41
>
> Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
> sdb1            240.00        15.78        37.40         15         37
>
> Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
> sdb1            206.93        13.17        28.50         13         28
>
> Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
> sdb1            224.75        15.39        27.89         15         28
>
> Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
> sdb1            270.71        16.85        25.95         16         25
>
> Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
> sdb1            215.84         8.81        32.40          8         32
>
> Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
> sdb1            216.16        19.11        20.75         18         20
>
> Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
> sdb1            211.11        14.67        35.77         14         35
>
> Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
> sdb1            208.91        15.04        26.95         15         27
>
> Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
> sdb1            277.23        24.30        28.53         24         28
>
> Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
> sdb1            202.97        12.29        34.79         12         35
> **********************************************************************
>
> Total disk throughput is varying a lot; on an average it looks like it
> is getting 45MB/s. Let's say 50% of that is going to cgroup1 (fio writers);
> then out of the rest of the 22 MB/s the reader seems to have got 18MB/s. These are
> highly approximate numbers. I think I need to come up with some kind of
> tool to measure per cgroup throughput (like we have for per partition
> stats) for a more accurate comparison.
>
> But the point is that the second cgroup got the isolation, and the read got
> preference with-in the same cgroup. The expected behavior.
>
> Run test2 with io-throttle
> ==========================
> Same setup of two groups. The only difference is that I set up the two groups
> with a (16MB) limit each. So the previous 32MB limit got divided between the two
> cgroups, 50% each.
>
> - 234179072 bytes (234 MB) copied, 90.8055 s, 2.6 MB/s
>
> Reader took 90 seconds to finish. It seems to have got around 16% of the
> available disk BW (16MB) to it.
>
> iostat output is long. Will just paste one section.
>
> ************************************************************************
> [..]
>
> Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
> sdb1            141.58        10.16        16.12         10         16
>
> Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
> sdb1            174.75         8.06        12.31          7         12
>
> Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
> sdb1             47.52         0.12         6.16          0          6
>
> Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
> sdb1             82.00         0.00        31.85          0         31
>
> Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
> sdb1            141.00         0.00        48.07          0         48
>
> Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
> sdb1             72.73         0.00        26.52          0         26
>
> ***************************************************************************
>
> Conclusion
> ==========
> It just reaffirms that with max BW control we are not doing a fair job
> of throttling, and hence we no longer retain the IO scheduler properties
> with-in the cgroup.
>
> With a proportional BW controller implemented at the IO scheduler level, one
> can do very tight integration with the IO controller and hence retain the
> IO scheduler behavior with-in the cgroup.

It is worth bugging you, I would say :).
Results are interesting, definitely. I'll check if it's possible to merge part of the io-throttle max BW control in this controller and who knows if finally we'll be able to converge to a common proposal... Thanks, -Andrea ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: IO scheduler based IO Controller V2 2009-05-07 22:19 ` Andrea Righi 2009-05-08 18:09 ` Vivek Goyal @ 2009-05-08 18:09 ` Vivek Goyal 1 sibling, 0 replies; 97+ messages in thread From: Vivek Goyal @ 2009-05-08 18:09 UTC (permalink / raw) To: Andrea Righi Cc: dhaval-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, snitzer-H+wXaHxf7aLQT0dZR+AlfA, dm-devel-H+wXaHxf7aLQT0dZR+AlfA, jens.axboe-QHcLZuEGTsvQT0dZR+AlfA, agk-H+wXaHxf7aLQT0dZR+AlfA, balbir-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, paolo.valente-rcYM44yAMweonA0d6jMUrA, fernando-gVGce1chcLdL9jVzuh4AOg, jmoyer-H+wXaHxf7aLQT0dZR+AlfA, fchecconi-Re5JQEeQqe8AvxtiuMwx3w, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, Andrew Morton On Fri, May 08, 2009 at 12:19:01AM +0200, Andrea Righi wrote: > On Thu, May 07, 2009 at 11:36:42AM -0400, Vivek Goyal wrote: > > Hmm.., my old config had "AS" as default scheduler that's why I was seeing > > the strange issue of RT task finishing after BE. My apologies for that. I > > somehow assumed that CFQ is default scheduler in my config. > > ok. > > > > > So I have re-run the test to see if we are still seeing the issue of > > loosing priority and class with-in cgroup. And we still do.. > > > > 2.6.30-rc4 with io-throttle patches > > =================================== > > Test1 > > ===== > > - Two readers, one BE prio 0 and other BE prio 7 in a cgroup limited with > > 8MB/s BW. > > > > 234179072 bytes (234 MB) copied, 55.8448 s, 4.2 MB/s > > prio 0 task finished > > 234179072 bytes (234 MB) copied, 55.8878 s, 4.2 MB/s > > > > Test2 > > ===== > > - Two readers, one RT prio 0 and other BE prio 7 in a cgroup limited with > > 8MB/s BW. > > > > 234179072 bytes (234 MB) copied, 55.8876 s, 4.2 MB/s > > 234179072 bytes (234 MB) copied, 55.8984 s, 4.2 MB/s > > RT task finished > > ok, coherent with the current io-throttle implementation. > > > > > Test3 > > ===== > > - Reader Starvation > > - I created a cgroup with BW limit of 64MB/s. 
First I just run the reader > > alone and then I run reader along with 4 writers 4 times. > > > > Reader alone > > 234179072 bytes (234 MB) copied, 3.71796 s, 63.0 MB/s > > > > Reader with 4 writers > > --------------------- > > First run > > 234179072 bytes (234 MB) copied, 30.394 s, 7.7 MB/s > > > > Second run > > 234179072 bytes (234 MB) copied, 26.9607 s, 8.7 MB/s > > > > Third run > > 234179072 bytes (234 MB) copied, 37.3515 s, 6.3 MB/s > > > > Fourth run > > 234179072 bytes (234 MB) copied, 36.817 s, 6.4 MB/s > > > > Note that out of 64MB/s limit of this cgroup, reader does not get even > > 1/5 of the BW. In normal systems, readers are advantaged and reader gets > > its job done much faster even in presence of multiple writers. > > And this is also coherent. The throttling is equally probable for read > and write. But this shouldn't happen if we saturate the physical disk BW > (doing proportional BW control or using a watermark close to 100 in > io-throttle). In this case IO scheduler logic shouldn't be totally > broken. > Can you please explain the watermark a bit more? So blockio.watermark=90 mean 90% of what? total disk BW? But disk BW varies based on work load? > Doing a very quick test with io-throttle, using a 10MB/s BW limit and > blockio.watermark=90: > > Launching reader > 256+0 records in > 256+0 records out > 268435456 bytes (268 MB) copied, 32.2798 s, 8.3 MB/s > > In the same time the writers wrote ~190MB, so the single reader got > about 1/3 of the total BW. > > 182M testzerofile4 > 198M testzerofile1 > 188M testzerofile3 > 189M testzerofile2 > But its now more a max bw controller at all now? I seem to be getting the total BW of (268+182+198+188+189)/32 = 32MB/s and you set the limit to 10MB/s? [..] > What are the results with your IO scheduler controller (if you already > have them, otherwise I'll repeat this test in my system)? 
It seems a > very interesting test to compare the advantages of the IO scheduler > solution respect to the io-throttle approach. > I had not done any reader writer testing so far. But you forced me to run some now. :-) Here are the results. Because one is max BW controller and other is proportional BW controller doing exact comparison is hard. Still.... Test1 ===== Try to run lots of writers (50 random writers using fio and 4 sequential writers with dd if=/dev/zero) and one single reader either in root group or with in one cgroup to show that readers are not starved by writers as opposed to io-throttle controller. Run test1 with vanilla kernel with CFQ ===================================== Launched 50 fio random writers, 4 sequential writers and 1 reader in root and noted how long it takes reader to finish. Also noted the per second output from iostat -d 1 -m /dev/sdb1 to monitor how disk throughput varies. *********************************************************************** # launch 50 writers fio job fio_args="--size=64m --rw=write --numjobs=50 --group_reporting" fio $fio_args --name=test2 --directory=/mnt/sdb/fio2/ --output=/mnt/sdb/fio2/test2.log > /dev/null & #launch 4 sequential writers ionice -c 2 -n 7 dd if=/dev/zero of=/mnt/sdb/testzerofile1 bs=4K count=524288 & ionice -c 2 -n 7 dd if=/dev/zero of=/mnt/sdb/testzerofile2 bs=4K count=524288 & ionice -c 2 -n 7 dd if=/dev/zero of=/mnt/sdb/testzerofile3 bs=4K count=524288 & ionice -c 2 -n 7 dd if=/dev/zero of=/mnt/sdb/testzerofile4 bs=4K count=524288 & echo "Sleeping for 5 seconds" sleep 5 echo "Launching reader" ionice -c 2 -n 0 dd if=/mnt/sdb/zerofile2 of=/dev/zero & wait $! echo "Reader Finished" *************************************************************************** Results ------- 234179072 bytes (234 MB) copied, 4.55047 s, 51.5 MB/s Reader finished in 4.5 seconds. 
Following are a few lines from the iostat output

***********************************************************************
Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            151.00         0.04        48.33          0         48

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            120.00         1.78        31.23          1         31

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            504.95        56.75         7.51         57          7

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            547.47        62.71         4.47         62          4

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            441.00        49.80         7.82         49          7

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            441.41        48.28        13.84         47         13

*************************************************************************

Note how first the write picks up, and then suddenly the reader comes in and CFQ allocates a huge chunk of BW to the reader to give it the advantage.

Run Test1 with IO scheduler based io controller patch
=====================================================

234179072 bytes (234 MB) copied, 5.23141 s, 44.8 MB/s

Reader finishes in 5.23 seconds. Why does it take more time than CFQ? Because it looks like the current algorithm is not punishing writers that hard. This can be fixed and is not an issue.

Following is some output from iostat.

**********************************************************************
Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            139.60         0.04        43.83          0         44

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            227.72        16.88        29.05         17         29

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            349.00        35.04        16.06         35         16

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            339.00        34.16        21.07         34         21

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            343.56        36.68        12.54         37         12

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            378.00        38.68        19.47         38         19

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            532.00        59.06        10.00         59         10

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            125.00         2.62        38.82          2         38
************************************************************************

Note how read throughput goes up when the reader comes in.
Also note that the writer is still getting some decent IO done, and that's why the reader took a little bit more time as compared to CFQ.

Run Test1 with IO throttle patches
==================================
Now the same test is run with the io-throttle patches. The only difference is that I ran the test in a cgroup with a max limit of 32MB/s. That should mean that effectively we got a disk which can support at max a 32MB/s IO rate. If we look at the above CFQ and io controller results, it looks like with the above load we touched a peak of 70MB/s. So one can think of the same test being run on a disk roughly half the speed of the original disk.

234179072 bytes (234 MB) copied, 144.207 s, 1.6 MB/s

Reader got a disk rate of 1.6MB/s (5 %) out of the 32MB/s capacity, as opposed to the CFQ and io scheduler controller cases where the reader got around 70-80% of disk BW under a similar workload.

Test2
=====
Run test2 with io scheduler based io controller
===============================================
Now run almost the same test with a little difference. This time I create two cgroups of the same weight 1000. I run the 50 fio random writers in one cgroup and 4 sequential writers and 1 reader in the second group. This test is more to show that the proportional BW IO controller is working, that because of the reader in group2, group1 writes are not killed (providing isolation), and secondly, that the reader still gets preference over the writers which are in the same group.

                   root
                  /    \
             group1    group2
   (50 fio writers)    (4 writers and one reader)

234179072 bytes (234 MB) copied, 12.8546 s, 18.2 MB/s

Reader finished in almost 13 seconds and got around 18MB/s. Remember, when everything was in the root group the reader got around 45MB/s. This is to account for the fact that half of the disk is now being shared by the other cgroup, which is running 50 fio writers, and the reader can't steal the disk from them.
Following is some portion of the iostat output when the reader became active
*********************************************************************
Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            103.92         0.03        40.21          0         41

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            240.00        15.78        37.40         15         37

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            206.93        13.17        28.50         13         28

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            224.75        15.39        27.89         15         28

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            270.71        16.85        25.95         16         25

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            215.84         8.81        32.40          8         32

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            216.16        19.11        20.75         18         20

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            211.11        14.67        35.77         14         35

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            208.91        15.04        26.95         15         27

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            277.23        24.30        28.53         24         28

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            202.97        12.29        34.79         12         35
**********************************************************************

Total disk throughput is varying a lot; on an average it looks like it is getting 45MB/s. Let's say 50% of that is going to cgroup1 (fio writers); then out of the rest of the 22 MB/s the reader seems to have got 18MB/s. These are highly approximate numbers. I think I need to come up with some kind of tool to measure per cgroup throughput (like we have for per partition stats) for a more accurate comparison.

But the point is that the second cgroup got the isolation, and the read got preference with-in the same cgroup. The expected behavior.

Run test2 with io-throttle
==========================
Same setup of two groups. The only difference is that I set up the two groups with a (16MB) limit each. So the previous 32MB limit got divided between the two cgroups, 50% each.

- 234179072 bytes (234 MB) copied, 90.8055 s, 2.6 MB/s

Reader took 90 seconds to finish. It seems to have got around 16% of the available disk BW (16MB) to it.

iostat output is long. Will just paste one section.
************************************************************************
[..]

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            141.58        10.16        16.12         10         16

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            174.75         8.06        12.31          7         12

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1             47.52         0.12         6.16          0          6

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1             82.00         0.00        31.85          0         31

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1            141.00         0.00        48.07          0         48

Device:            tps    MB_read/s    MB_wrtn/s    MB_read    MB_wrtn
sdb1             72.73         0.00        26.52          0         26

***************************************************************************

Conclusion
==========
It just reaffirms that with max BW control we are not doing a fair job of throttling, and hence we no longer retain the IO scheduler properties with-in the cgroup.

With a proportional BW controller implemented at the IO scheduler level, one can do very tight integration with the IO controller and hence retain the IO scheduler behavior with-in the cgroup.

Thanks
Vivek

^ permalink raw reply	[flat|nested] 97+ messages in thread
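The per-cgroup throughput tool Vivek wishes for above could be roughly approximated in user space by summing the per-task counters from /proc/<pid>/io over the pids listed in a cgroup's tasks file (cgroup v1 layout assumed; the counters require CONFIG_TASK_IO_ACCOUNTING and miss tasks that have already exited, so this is a sketch, not a substitute for real per-cgroup stats):

```python
def parse_proc_io(text):
    """Parse the contents of /proc/<pid>/io into a dict of counters."""
    counters = {}
    for line in text.splitlines():
        key, _, value = line.partition(":")
        counters[key.strip()] = int(value)
    return counters


def cgroup_io_bytes(pids, read_proc_io):
    """Sum read_bytes/write_bytes over the tasks of one cgroup.

    `read_proc_io(pid)` returns the text of /proc/<pid>/io; it is injected
    here so the sketch is testable, on a real system it would read the file.
    """
    total = {"read_bytes": 0, "write_bytes": 0}
    for pid in pids:
        counters = parse_proc_io(read_proc_io(pid))
        total["read_bytes"] += counters.get("read_bytes", 0)
        total["write_bytes"] += counters.get("write_bytes", 0)
    return total
```

Sampling these totals once per second per cgroup would give per-group MB/s figures comparable to the per-partition iostat numbers used throughout these tests.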
* Re: IO scheduler based IO Controller V2 2009-05-07 14:45 ` Vivek Goyal @ 2009-05-07 22:40 ` Andrea Righi 1 sibling, 0 replies; 97+ messages in thread From: Andrea Righi @ 2009-05-07 22:40 UTC (permalink / raw) To: Vivek Goyal Cc: Andrew Morton, nauman, dpshah, lizf, mikew, fchecconi, paolo.valente, jens.axboe, ryov, fernando, s-uchida, taka, guijianfeng, jmoyer, dhaval, balbir, linux-kernel, containers, agk, dm-devel, snitzer, m-ikeda, peterz On Thu, May 07, 2009 at 10:45:01AM -0400, Vivek Goyal wrote: > So now we are left with the issue of loosing the notion of priority and > class with-in cgroup. In fact on bigger systems we will probably run into > issues of kiothrottled scalability as single thread is trying to cater to > all the disks. > > If we do max bw control at IO scheduler level, then I think we should be able > to control max bw while maintaining the notion of priority and class with-in > cgroup. Also there are multiple pdflush threads and jens seems to be pushing > flusher threads per bdi which will help us achieve greater scalability and > don't have to replicate that infrastructure for kiothrottled also. There's a lot of room for improvements and optimizations in the kiothrottled part; obviously the single-threaded approach is not a definitive solution. Flusher threads are probably a good solution. But I don't think we need to replicate the pdflush replacement infrastructure for throttled writeback IO. Instead it could just be integrated with the flusher threads, i.e. activate the flusher threads only when a request needs to be written to disk according to the dirty memory limit and the IO BW limits. I mean, I don't see any critical problem for this part. Instead, preserving the IO priority and IO scheduler logic inside cgroups seems a more critical issue to me.
And I'm quite convinced that the right approach for this is to operate at the IO scheduler level, but I'm still a little bit skeptical that operating only at the IO scheduler level would resolve all our problems. -Andrea ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: IO scheduler based IO Controller V2 2009-05-06 2:33 ` Vivek Goyal ` (3 preceding siblings ...) 2009-05-06 20:32 ` Vivek Goyal @ 2009-05-07 0:18 ` Ryo Tsuruta [not found] ` <20090507.091858.226775723.ryov-jCdQPDEk3idL9jVzuh4AOg@public.gmane.org> 2009-05-08 14:24 ` Rik van Riel 4 siblings, 2 replies; 97+ messages in thread From: Ryo Tsuruta @ 2009-05-07 0:18 UTC (permalink / raw) To: vgoyal Cc: akpm, nauman, dpshah, lizf, mikew, fchecconi, paolo.valente, jens.axboe, fernando, s-uchida, taka, guijianfeng, jmoyer, dhaval, balbir, linux-kernel, containers, righi.andrea, agk, dm-devel, snitzer, m-ikeda, peterz Hi Vivek, > Ryo, dm-ioband breaks the notion of classes and priority of CFQ because > of FIFO dispatch of buffered bios. Apart from that it tries to provide > fairness in terms of actual IO done and that would mean a seeky workload > will can use disk for much longer to get equivalent IO done and slow down > other applications. Implementing IO controller at IO scheduler level gives > us tigher control. Will it not meet your requirements? If you got specific > concerns with IO scheduler based contol patches, please highlight these and > we will see how these can be addressed. I'd like to avoid complicating the existing IO schedulers and other kernel code, and to give users a choice of whether or not to use it. I know that you chose an approach of using compile-time options to get the same behavior as the old system, but device-mapper drivers can be added, removed and replaced while the system is running. Thanks, Ryo Tsuruta ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: IO scheduler based IO Controller V2 @ 2009-05-07 1:25 ` Vivek Goyal 0 siblings, 0 replies; 97+ messages in thread From: Vivek Goyal @ 2009-05-07 1:25 UTC (permalink / raw) To: Ryo Tsuruta Cc: akpm, nauman, dpshah, lizf, mikew, fchecconi, paolo.valente, jens.axboe, fernando, s-uchida, taka, guijianfeng, jmoyer, dhaval, balbir, linux-kernel, containers, righi.andrea, agk, dm-devel, snitzer, m-ikeda, peterz On Thu, May 07, 2009 at 09:18:58AM +0900, Ryo Tsuruta wrote: > Hi Vivek, > > > Ryo, dm-ioband breaks the notion of classes and priority of CFQ because > > of FIFO dispatch of buffered bios. Apart from that it tries to provide > > fairness in terms of actual IO done and that would mean a seeky workload > > will can use disk for much longer to get equivalent IO done and slow down > > other applications. Implementing IO controller at IO scheduler level gives > > us tigher control. Will it not meet your requirements? If you got specific > > concerns with IO scheduler based contol patches, please highlight these and > > we will see how these can be addressed. > > I'd like to avoid making complicated existing IO schedulers and other > kernel codes and to give a choice to users whether or not to use it. > I know that you chose an approach that using compile time options to > get the same behavior as old system, but device-mapper drivers can be > added, removed and replaced while system is running. > The same is possible with the IO scheduler based controller. If you don't want the cgroup stuff, don't create any groups. By default everything will be in the root group and you will get the old behavior. If you want the IO controller, just create a cgroup, assign it a weight and move tasks there. So what further choice do you want that is missing here? Thanks Vivek ^ permalink raw reply [flat|nested] 97+ messages in thread
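[Editorial note: the configuration flow Vivek describes — create a cgroup, assign a weight, move tasks in — might look roughly like the sketch below from userspace. The subsystem name `io`, the file name `io.weight` and the mount point are assumptions based on this patchset, not a stable interface.]

```shell
# Mount a cgroup hierarchy with the IO controller subsystem enabled
# (subsystem and control-file names here are illustrative assumptions).
mount -t cgroup -o io none /cgroup

# Create a group and give it a proportional weight (a share, not a cap).
mkdir /cgroup/group1
echo 500 > /cgroup/group1/io.weight

# Move the current shell into the group; its IO is accounted there.
echo $$ > /cgroup/group1/tasks

# Tasks that are never moved stay in the root group and see the old
# (pre-controller) behavior.
```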
* Re: IO scheduler based IO Controller V2 2009-05-07 1:25 ` Vivek Goyal (?) (?) @ 2009-05-11 11:23 ` Ryo Tsuruta [not found] ` <20090511.202309.112614168.ryov-jCdQPDEk3idL9jVzuh4AOg@public.gmane.org> -1 siblings, 1 reply; 97+ messages in thread From: Ryo Tsuruta @ 2009-05-11 11:23 UTC (permalink / raw) To: vgoyal Cc: akpm, nauman, dpshah, lizf, mikew, fchecconi, paolo.valente, jens.axboe, fernando, s-uchida, taka, guijianfeng, jmoyer, dhaval, balbir, linux-kernel, containers, righi.andrea, agk, dm-devel, snitzer, m-ikeda, peterz Hi Vivek, From: Vivek Goyal <vgoyal@redhat.com> Subject: Re: IO scheduler based IO Controller V2 Date: Wed, 6 May 2009 21:25:59 -0400 > On Thu, May 07, 2009 at 09:18:58AM +0900, Ryo Tsuruta wrote: > > Hi Vivek, > > > > > Ryo, dm-ioband breaks the notion of classes and priority of CFQ because > > > of FIFO dispatch of buffered bios. Apart from that it tries to provide > > > fairness in terms of actual IO done and that would mean a seeky workload > > > will can use disk for much longer to get equivalent IO done and slow down > > > other applications. Implementing IO controller at IO scheduler level gives > > > us tigher control. Will it not meet your requirements? If you got specific > > > concerns with IO scheduler based contol patches, please highlight these and > > > we will see how these can be addressed. > > > > I'd like to avoid making complicated existing IO schedulers and other > > kernel codes and to give a choice to users whether or not to use it. > > I know that you chose an approach that using compile time options to > > get the same behavior as old system, but device-mapper drivers can be > > added, removed and replaced while system is running. > > > > Same is possible with IO scheduler based controller. If you don't want > cgroup stuff, don't create those. By default everything will be in root > group and you will get the old behavior. 
> > If you want io controller stuff, just create the cgroup, assign weight > and move task there. So what more choices do you want which are missing > here? What I mean to say is that device-mapper drivers can be completely removed from the kernel if not used. I know that dm-ioband has some issues which can be addressed by your IO controller, but I'm not sure your controller works well. So I would like to see some benchmark results of your IO controller. Thanks, Ryo Tsuruta ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: IO scheduler based IO Controller V2 @ 2009-05-11 12:49 ` Vivek Goyal 0 siblings, 0 replies; 97+ messages in thread From: Vivek Goyal @ 2009-05-11 12:49 UTC (permalink / raw) To: Ryo Tsuruta Cc: akpm, nauman, dpshah, lizf, mikew, fchecconi, paolo.valente, jens.axboe, fernando, s-uchida, taka, guijianfeng, jmoyer, dhaval, balbir, linux-kernel, containers, righi.andrea, agk, dm-devel, snitzer, m-ikeda, peterz On Mon, May 11, 2009 at 08:23:09PM +0900, Ryo Tsuruta wrote: > Hi Vivek, > > From: Vivek Goyal <vgoyal@redhat.com> > Subject: Re: IO scheduler based IO Controller V2 > Date: Wed, 6 May 2009 21:25:59 -0400 > > > On Thu, May 07, 2009 at 09:18:58AM +0900, Ryo Tsuruta wrote: > > > Hi Vivek, > > > > > > > Ryo, dm-ioband breaks the notion of classes and priority of CFQ because > > > > of FIFO dispatch of buffered bios. Apart from that it tries to provide > > > > fairness in terms of actual IO done and that would mean a seeky workload > > > > will can use disk for much longer to get equivalent IO done and slow down > > > > other applications. Implementing IO controller at IO scheduler level gives > > > > us tigher control. Will it not meet your requirements? If you got specific > > > > concerns with IO scheduler based contol patches, please highlight these and > > > > we will see how these can be addressed. > > > > > > I'd like to avoid making complicated existing IO schedulers and other > > > kernel codes and to give a choice to users whether or not to use it. > > > I know that you chose an approach that using compile time options to > > > get the same behavior as old system, but device-mapper drivers can be > > > added, removed and replaced while system is running. > > > > > > > Same is possible with IO scheduler based controller. If you don't want > > cgroup stuff, don't create those. By default everything will be in root > > group and you will get the old behavior. 
> > > > If you want io controller stuff, just create the cgroup, assign weight > > and move task there. So what more choices do you want which are missing > > here? > > What I mean to say is that device-mapper drivers can be completely > removed from the kernel if not used. > > I know that dm-ioband has some issues which can be addressed by your > IO controller, but I'm not sure your controller works well. So I would > like to see some benchmark results of your IO controller. > Fair enough. The IO scheduler based IO controller is still a work in progress, and we have started to get some basic things right. I think after 3-4 more iterations the patches will be stable and functional enough that I should be able to post some benchmark numbers as well. Currently I am posting intermediate snapshots of my tree to lkml to get design feedback, so that if there are fundamental design issues we can sort them out. Thanks Vivek ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: IO scheduler based IO Controller V2 [not found] ` <20090507.091858.226775723.ryov-jCdQPDEk3idL9jVzuh4AOg@public.gmane.org> 2009-05-07 1:25 ` Vivek Goyal @ 2009-05-08 14:24 ` Rik van Riel 1 sibling, 0 replies; 97+ messages in thread From: Rik van Riel @ 2009-05-08 14:24 UTC (permalink / raw) To: Ryo Tsuruta Cc: dhaval-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, snitzer-H+wXaHxf7aLQT0dZR+AlfA, dm-devel-H+wXaHxf7aLQT0dZR+AlfA, jens.axboe-QHcLZuEGTsvQT0dZR+AlfA, agk-H+wXaHxf7aLQT0dZR+AlfA, balbir-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, paolo.valente-rcYM44yAMweonA0d6jMUrA, fernando-gVGce1chcLdL9jVzuh4AOg, jmoyer-H+wXaHxf7aLQT0dZR+AlfA, righi.andrea-Re5JQEeQqe8AvxtiuMwx3w, fchecconi-Re5JQEeQqe8AvxtiuMwx3w, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b Ryo Tsuruta wrote: > Hi Vivek, > >> Ryo, dm-ioband breaks the notion of classes and priority of CFQ because >> of FIFO dispatch of buffered bios. Apart from that it tries to provide >> fairness in terms of actual IO done and that would mean a seeky workload >> will can use disk for much longer to get equivalent IO done and slow down >> other applications. Implementing IO controller at IO scheduler level gives >> us tigher control. Will it not meet your requirements? If you got specific >> concerns with IO scheduler based contol patches, please highlight these and >> we will see how these can be addressed. > > I'd like to avoid making complicated existing IO schedulers and other > kernel codes and to give a choice to users whether or not to use it. > I know that you chose an approach that using compile time options to > get the same behavior as old system, but device-mapper drivers can be > added, removed and replaced while system is running. I do not believe that every use of cgroups will end up with a separate logical volume for each group. 
In fact, if you look at group-per-UID usage, which could be quite common on shared web servers and shell servers, I would expect all the groups to share the same filesystem. I do not believe dm-ioband would be useful in that configuration, while the IO scheduler based IO controller will just work. -- All rights reversed. ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: IO scheduler based IO Controller V2 2009-05-07 0:18 ` Ryo Tsuruta [not found] ` <20090507.091858.226775723.ryov-jCdQPDEk3idL9jVzuh4AOg@public.gmane.org> @ 2009-05-08 14:24 ` Rik van Riel [not found] ` <4A0440B2.7040300-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2009-05-11 10:11 ` Ryo Tsuruta 1 sibling, 2 replies; 97+ messages in thread From: Rik van Riel @ 2009-05-08 14:24 UTC (permalink / raw) To: Ryo Tsuruta Cc: vgoyal, akpm, nauman, dpshah, lizf, mikew, fchecconi, paolo.valente, jens.axboe, fernando, s-uchida, taka, guijianfeng, jmoyer, dhaval, balbir, linux-kernel, containers, righi.andrea, agk, dm-devel, snitzer, m-ikeda, peterz Ryo Tsuruta wrote: > Hi Vivek, > >> Ryo, dm-ioband breaks the notion of classes and priority of CFQ because >> of FIFO dispatch of buffered bios. Apart from that it tries to provide >> fairness in terms of actual IO done and that would mean a seeky workload >> will can use disk for much longer to get equivalent IO done and slow down >> other applications. Implementing IO controller at IO scheduler level gives >> us tigher control. Will it not meet your requirements? If you got specific >> concerns with IO scheduler based contol patches, please highlight these and >> we will see how these can be addressed. > > I'd like to avoid making complicated existing IO schedulers and other > kernel codes and to give a choice to users whether or not to use it. > I know that you chose an approach that using compile time options to > get the same behavior as old system, but device-mapper drivers can be > added, removed and replaced while system is running. I do not believe that every use of cgroups will end up with a separate logical volume for each group. In fact, if you look at group-per-UID usage, which could be quite common on shared web servers and shell servers, I would expect all the groups to share the same filesystem. 
I do not believe dm-ioband would be useful in that configuration, while the IO scheduler based IO controller will just work. -- All rights reversed. ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: IO scheduler based IO Controller V2 2009-05-08 14:24 ` Rik van Riel [not found] ` <4A0440B2.7040300-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2009-05-11 10:11 ` Ryo Tsuruta 1 sibling, 0 replies; 97+ messages in thread From: Ryo Tsuruta @ 2009-05-11 10:11 UTC (permalink / raw) To: riel Cc: vgoyal, akpm, nauman, dpshah, lizf, mikew, fchecconi, paolo.valente, jens.axboe, fernando, s-uchida, taka, guijianfeng, jmoyer, dhaval, balbir, linux-kernel, containers, righi.andrea, agk, dm-devel, snitzer, m-ikeda, peterz Hi Rik, From: Rik van Riel <riel@redhat.com> Subject: Re: IO scheduler based IO Controller V2 Date: Fri, 08 May 2009 10:24:50 -0400 > Ryo Tsuruta wrote: > > Hi Vivek, > > > >> Ryo, dm-ioband breaks the notion of classes and priority of CFQ because > >> of FIFO dispatch of buffered bios. Apart from that it tries to provide > >> fairness in terms of actual IO done and that would mean a seeky workload > >> will can use disk for much longer to get equivalent IO done and slow down > >> other applications. Implementing IO controller at IO scheduler level gives > >> us tigher control. Will it not meet your requirements? If you got specific > >> concerns with IO scheduler based contol patches, please highlight these and > >> we will see how these can be addressed. > > I'd like to avoid making complicated existing IO schedulers and other > > kernel codes and to give a choice to users whether or not to use it. > > I know that you chose an approach that using compile time options to > > get the same behavior as old system, but device-mapper drivers can be > > added, removed and replaced while system is running. > > I do not believe that every use of cgroups will end up with > a separate logical volume for each group. > > In fact, if you look at group-per-UID usage, which could be > quite common on shared web servers and shell servers, I would > expect all the groups to share the same filesystem. 
> > I do not believe dm-ioband would be useful in that configuration, > while the IO scheduler based IO controller will just work. dm-ioband can control bandwidth on a per-cgroup basis, the same as Vivek's IO controller. Could you explain what you want to do, and how the IO scheduler based IO controller would be configured in that case? Thanks, Ryo Tsuruta ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: IO scheduler based IO Controller V2 [not found] ` <20090505132441.1705bfad.akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org> @ 2009-05-05 22:20 ` Peter Zijlstra 2009-05-06 2:33 ` Vivek Goyal 2009-05-06 3:41 ` Balbir Singh 2 siblings, 0 replies; 97+ messages in thread From: Peter Zijlstra @ 2009-05-05 22:20 UTC (permalink / raw) To: Andrew Morton Cc: dhaval-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, snitzer-H+wXaHxf7aLQT0dZR+AlfA, dm-devel-H+wXaHxf7aLQT0dZR+AlfA, jens.axboe-QHcLZuEGTsvQT0dZR+AlfA, agk-H+wXaHxf7aLQT0dZR+AlfA, balbir-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, paolo.valente-rcYM44yAMweonA0d6jMUrA, fernando-gVGce1chcLdL9jVzuh4AOg, jmoyer-H+wXaHxf7aLQT0dZR+AlfA, righi.andrea-Re5JQEeQqe8AvxtiuMwx3w, fchecconi-Re5JQEeQqe8AvxtiuMwx3w, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, linux-kernel-u79uwXL29TY76Z2rM5mHXA On Tue, 2009-05-05 at 13:24 -0700, Andrew Morton wrote: > On Tue, 5 May 2009 15:58:27 -0400 > Vivek Goyal <vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote: > > > > > Hi All, > > > > Here is the V2 of the IO controller patches generated on top of 2.6.30-rc4. > > ... > > Currently primarily two other IO controller proposals are out there. > > > > dm-ioband > > --------- > > This patch set is from Ryo Tsuruta from valinux. > > ... > > IO-throttling > > ------------- > > This patch set is from Andrea Righi provides max bandwidth controller. > > I'm thinking we need to lock you guys in a room and come back in 15 minutes. > > Seriously, how are we to resolve this? We could lock me in a room and > cmoe back in 15 days, but there's no reason to believe that I'd emerge > with the best answer. > > I tend to think that a cgroup-based controller is the way to go. > Anything else will need to be wired up to cgroups _anyway_, and that > might end up messy. FWIW I subscribe to the io-scheduler faith as opposed to the device-mapper cult ;-) Also, I don't think a simple throttle will be very useful, a more mature solution should cater to more use cases. 
^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: IO scheduler based IO Controller V2 [not found] ` <20090505132441.1705bfad.akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org> 2009-05-05 22:20 ` Peter Zijlstra @ 2009-05-06 2:33 ` Vivek Goyal 2009-05-06 3:41 ` Balbir Singh 2 siblings, 0 replies; 97+ messages in thread From: Vivek Goyal @ 2009-05-06 2:33 UTC (permalink / raw) To: Andrew Morton Cc: dhaval-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, snitzer-H+wXaHxf7aLQT0dZR+AlfA, dm-devel-H+wXaHxf7aLQT0dZR+AlfA, jens.axboe-QHcLZuEGTsvQT0dZR+AlfA, agk-H+wXaHxf7aLQT0dZR+AlfA, balbir-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, paolo.valente-rcYM44yAMweonA0d6jMUrA, fernando-gVGce1chcLdL9jVzuh4AOg, jmoyer-H+wXaHxf7aLQT0dZR+AlfA, fchecconi-Re5JQEeQqe8AvxtiuMwx3w, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, righi.andrea-Re5JQEeQqe8AvxtiuMwx3w On Tue, May 05, 2009 at 01:24:41PM -0700, Andrew Morton wrote: > On Tue, 5 May 2009 15:58:27 -0400 > Vivek Goyal <vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote: > > > > > Hi All, > > > > Here is the V2 of the IO controller patches generated on top of 2.6.30-rc4. > > ... > > Currently primarily two other IO controller proposals are out there. > > > > dm-ioband > > --------- > > This patch set is from Ryo Tsuruta from valinux. > > ... > > IO-throttling > > ------------- > > This patch set is from Andrea Righi provides max bandwidth controller. > > I'm thinking we need to lock you guys in a room and come back in 15 minutes. > > Seriously, how are we to resolve this? We could lock me in a room and > cmoe back in 15 days, but there's no reason to believe that I'd emerge > with the best answer. > > I tend to think that a cgroup-based controller is the way to go. > Anything else will need to be wired up to cgroups _anyway_, and that > might end up messy. Hi Andrew, Sorry, I did not quite get what you mean by a cgroup-based controller.
If you mean that we use cgroups for grouping tasks for controlling IO, then both the IO scheduler based controller as well as the io-throttling proposal do that. dm-ioband also supports that to some extent, but it requires the extra step of transferring cgroup grouping information to the dm-ioband device using dm-tools. But if you meant the io-throttle patches, then I think they solve only part of the problem, and that is max bw control. They do not offer minimum BW/minimum disk share guarantees as offered by proportional BW control. IOW, they support upper limit control and do not support a work-conserving IO controller which lets a group use the whole BW if competing groups are not present. IMHO, proportional BW control is an important feature which we will need, and IIUC, the io-throttle patches can't be easily extended to support proportional BW control. OTOH, one should be able to extend the IO scheduler based proportional weight controller to also support max bw control. Andrea, last time you were planning to have a look at my patches and see if a max bw controller can be implemented there. I have a feeling that it should not be too difficult to implement it there. We already have the hierarchical tree of io queues and groups in the elevator layer, and we run the BFQ (WF2Q+) algorithm to select the next queue to dispatch IO from. It is just a matter of also keeping track of the IO rate per queue/group, and we should easily be able to delay the dispatch of IO from a queue if its group has crossed the specified max bw. This should lead to less code and reduced complexity (compared with the case where we do max bw control with the io-throttling patches and proportional BW control using the IO scheduler based control patches). So do you think that it would make sense to do max BW control along with the proportional weight IO controller at the IO scheduler?
If yes, then we can work together and continue to develop this patchset to also support max bw control, meet your requirements, and drop the io-throttling patches. The only thing which concerns me is the fact that the IO scheduler does not have a view of the higher level logical device. So if somebody has set up a software RAID and wants to put a max BW limit on the software raid device, this solution will not work. One shall have to live with max bw limits on the individual disks (where the io scheduler is actually running). Do your patches allow putting a limit on software RAID devices also? Ryo, dm-ioband breaks the notion of classes and priority of CFQ because of FIFO dispatch of buffered bios. Apart from that, it tries to provide fairness in terms of actual IO done, and that would mean a seeky workload can use the disk for much longer to get equivalent IO done and slow down other applications. Implementing the IO controller at the IO scheduler level gives us tighter control. Will it not meet your requirements? If you have specific concerns with the IO scheduler based control patches, please highlight them and we will see how they can be addressed. Thanks Vivek ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: IO scheduler based IO Controller V2 [not found] ` <20090505132441.1705bfad.akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org> 2009-05-05 22:20 ` Peter Zijlstra 2009-05-06 2:33 ` Vivek Goyal @ 2009-05-06 3:41 ` Balbir Singh 2 siblings, 0 replies; 97+ messages in thread From: Balbir Singh @ 2009-05-06 3:41 UTC (permalink / raw) To: Andrew Morton Cc: paolo.valente-rcYM44yAMweonA0d6jMUrA, dhaval-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, snitzer-H+wXaHxf7aLQT0dZR+AlfA, fernando-gVGce1chcLdL9jVzuh4AOg, jmoyer-H+wXaHxf7aLQT0dZR+AlfA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, fchecconi-Re5JQEeQqe8AvxtiuMwx3w, dm-devel-H+wXaHxf7aLQT0dZR+AlfA, jens.axboe-QHcLZuEGTsvQT0dZR+AlfA, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, agk-H+wXaHxf7aLQT0dZR+AlfA, righi.andrea-Re5JQEeQqe8AvxtiuMwx3w * Andrew Morton <akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b@public.gmane.org> [2009-05-05 13:24:41]: > On Tue, 5 May 2009 15:58:27 -0400 > Vivek Goyal <vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> wrote: > > > > > Hi All, > > > > Here is the V2 of the IO controller patches generated on top of 2.6.30-rc4. > > ... > > Currently primarily two other IO controller proposals are out there. > > > > dm-ioband > > --------- > > This patch set is from Ryo Tsuruta from valinux. > > ... > > IO-throttling > > ------------- > > This patch set is from Andrea Righi provides max bandwidth controller. > > I'm thinking we need to lock you guys in a room and come back in 15 minutes. > > Seriously, how are we to resolve this? We could lock me in a room and > cmoe back in 15 days, but there's no reason to believe that I'd emerge > with the best answer. > We are planning an IO mini-summit prior to the kernel summit (hopefully we'll all be able to attend and decide). -- Balbir ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: IO scheduler based IO Controller V2 2009-05-06 3:41 ` Balbir Singh @ 2009-05-06 13:28 ` Vivek Goyal [not found] ` <20090506034118.GC4416-SINUvgVNF2CyUtPGxGje5AC/G2K4zDHf@public.gmane.org> 1 sibling, 0 replies; 97+ messages in thread From: Vivek Goyal @ 2009-05-06 13:28 UTC (permalink / raw) To: Balbir Singh Cc: Andrew Morton, dhaval, snitzer, dm-devel, jens.axboe, agk, paolo.valente, fernando, jmoyer, fchecconi, containers, linux-kernel, righi.andrea On Wed, May 06, 2009 at 09:11:18AM +0530, Balbir Singh wrote: > * Andrew Morton <akpm@linux-foundation.org> [2009-05-05 13:24:41]: > > > On Tue, 5 May 2009 15:58:27 -0400 > > Vivek Goyal <vgoyal@redhat.com> wrote: > > > > > > > > Hi All, > > > > > > Here is the V2 of the IO controller patches generated on top of 2.6.30-rc4. > > > ... > > > Currently primarily two other IO controller proposals are out there. > > > > > > dm-ioband > > > --------- > > > This patch set is from Ryo Tsuruta from valinux. > > > ... > > > IO-throttling > > > ------------- > > > This patch set is from Andrea Righi provides max bandwidth controller. > > > > I'm thinking we need to lock you guys in a room and come back in 15 minutes. > > > > Seriously, how are we to resolve this? We could lock me in a room and > > cmoe back in 15 days, but there's no reason to believe that I'd emerge > > with the best answer. > > > > We are planning an IO mini-summit prior to the kernel summit > (hopefully we'll all be able to attend and decide). Hi Balbir, The mini-summit is still a few months away. I think a better idea would be to try to thrash out the details here on lkml and try to reach some conclusion. It's a complicated problem and there are no simple and easy answers. If we can't reach a conclusion here, I am skeptical that the mini-summit will serve that purpose. Thanks Vivek ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: IO scheduler based IO Controller V2 [not found] ` <1241553525-28095-1-git-send-email-vgoyal-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> 2009-05-05 20:24 ` Andrew Morton @ 2009-05-06 8:11 ` Gui Jianfeng 1 sibling, 0 replies; 97+ messages in thread From: Gui Jianfeng @ 2009-05-06 8:11 UTC (permalink / raw) To: Vivek Goyal Cc: dhaval-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, snitzer-H+wXaHxf7aLQT0dZR+AlfA, dm-devel-H+wXaHxf7aLQT0dZR+AlfA, jens.axboe-QHcLZuEGTsvQT0dZR+AlfA, agk-H+wXaHxf7aLQT0dZR+AlfA, balbir-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, paolo.valente-rcYM44yAMweonA0d6jMUrA, fernando-gVGce1chcLdL9jVzuh4AOg, jmoyer-H+wXaHxf7aLQT0dZR+AlfA, fchecconi-Re5JQEeQqe8AvxtiuMwx3w, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, righi.andrea-Re5JQEeQqe8AvxtiuMwx3w Vivek Goyal wrote: > Hi All, > > Here is the V2 of the IO controller patches generated on top of 2.6.30-rc4. > First version of the patches was posted here. Hi Vivek, I ran a simple test against V2 and triggered a kernel panic. The following script can reproduce this bug. It seems that the cgroup is already removed, but the IO controller still tries to access it.

#!/bin/sh
echo 1 > /proc/sys/vm/drop_caches
mkdir /cgroup 2> /dev/null
mount -t cgroup -o io,blkio io /cgroup
mkdir /cgroup/test1
mkdir /cgroup/test2
echo 100 > /cgroup/test1/io.weight
echo 500 > /cgroup/test2/io.weight

./rwio -w -f 2000M.1 &   # do async write
pid1=$!
echo $pid1 > /cgroup/test1/tasks

./rwio -w -f 2000M.2 &
pid2=$!
echo $pid2 > /cgroup/test2/tasks

sleep 10
kill -9 $pid1
kill -9 $pid2
sleep 1

echo ======
cat /cgroup/test1/io.disk_time
cat /cgroup/test2/io.disk_time

echo ======
cat /cgroup/test1/io.disk_sectors
cat /cgroup/test2/io.disk_sectors

rmdir /cgroup/test1
rmdir /cgroup/test2
umount /cgroup
rmdir /cgroup

BUG: unable to handle kernel NULL pointer dereferec IP: [<c0448c24>] cgroup_path+0xc/0x97 *pde = 64d2d067 Oops: 0000 [#1] SMP last sysfs file: /sys/block/md0/range Modules linked in: ipv6 cpufreq_ondemand acpi_cpufreq dm_mirror dm_multipath sbd Pid: 132, comm: kblockd/0 Not tainted (2.6.30-rc4-Vivek-V2 #1) Veriton M460 EIP: 0060:[<c0448c24>] EFLAGS: 00010086 CPU: 0 EIP is at cgroup_path+0xc/0x97 EAX: 00000100 EBX: f60adca0 ECX: 00000080 EDX: f709fe28 ESI: f60adca8 EDI: f709fe28 EBP: 00000100 ESP: f709fdf0 DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 Process kblockd/0 (pid: 132, ti=f709f000 task=f70a8f60 task.ti=f709f000) Stack: f709fe28 f68c5698 f60adca0 f60adca8 f709fe28 f68de801 c04f5389 00000080 f68de800 f7094d0c f6a29118 f68bde00 00000016 c04f5e8d c04f5340 00000080 c0579fec f68c5e94 00000082 c042edb4 f68c5fd4 f68c5fd4 c080b520 00000082 Call Trace: [<c04f5389>] ? io_group_path+0x6d/0x89 [<c04f5e8d>] ? elv_ioq_served+0x2a/0x7a [<c04f5340>] ? io_group_path+0x24/0x89 [<c0579fec>] ? ide_build_dmatable+0xda/0x130 [<c042edb4>] ? lock_timer_base+0x19/0x35 [<c042ef0c>] ? mod_timer+0x9f/0xa8 [<c04fdee6>] ? __delay+0x6/0x7 [<c057364f>] ? ide_execute_command+0x5d/0x71 [<c0579d4f>] ? ide_dma_intr+0x0/0x99 [<c0576496>] ? do_rw_taskfile+0x201/0x213 [<c04f6daa>] ? __elv_ioq_slice_expired+0x212/0x25e [<c04f7e15>] ? elv_fq_select_ioq+0x121/0x184 [<c04e8a2f>] ? elv_select_sched_queue+0x1e/0x2e [<c04f439c>] ? cfq_dispatch_requests+0xaa/0x238 [<c04e7e67>] ? elv_next_request+0x152/0x15f [<c04240c2>] ? dequeue_task_fair+0x16/0x2d [<c0572f49>] ? do_ide_request+0x10f/0x4c8 [<c0642d44>] ? __schedule+0x845/0x893 [<c042edb4>] ? lock_timer_base+0x19/0x35 [<c042f1be>] ?
del_timer+0x41/0x47 [<c04ea5c6>] ? __generic_unplug_device+0x23/0x25 [<c04f530d>] ? elv_kick_queue+0x19/0x28 [<c0434b77>] ? worker_thread+0x11f/0x19e [<c04f52f4>] ? elv_kick_queue+0x0/0x28 [<c0436ffc>] ? autoremove_wake_function+0x0/0x2d [<c0434a58>] ? worker_thread+0x0/0x19e [<c0436f3b>] ? kthread+0x42/0x67 [<c0436ef9>] ? kthread+0x0/0x67 [<c040326f>] ? kernel_thread_helper+0x7/0x10 Code: c0 84 c0 74 0e 89 d8 e8 7c e9 fd ff eb 05 bf fd ff ff ff e8 c0 ea ff ff 8 EIP: [<c0448c24>] cgroup_path+0xc/0x97 SS:ESP 0068:f709fdf0 CR2: 000000000000011c ---[ end trace 2d4bc25a2c33e394 ]--- -- Regards Gui Jianfeng ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: IO scheduler based IO Controller V2 2009-05-06 8:11 ` Gui Jianfeng @ 2009-05-06 16:10 ` Vivek Goyal 0 siblings, 0 replies; 97+ messages in thread From: Vivek Goyal @ 2009-05-06 16:10 UTC (permalink / raw) To: Gui Jianfeng Cc: dhaval-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, snitzer-H+wXaHxf7aLQT0dZR+AlfA, dm-devel-H+wXaHxf7aLQT0dZR+AlfA, jens.axboe-QHcLZuEGTsvQT0dZR+AlfA, agk-H+wXaHxf7aLQT0dZR+AlfA, balbir-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, paolo.valente-rcYM44yAMweonA0d6jMUrA, fernando-gVGce1chcLdL9jVzuh4AOg, jmoyer-H+wXaHxf7aLQT0dZR+AlfA, fchecconi-Re5JQEeQqe8AvxtiuMwx3w, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, righi.andrea-Re5JQEeQqe8AvxtiuMwx3w On Wed, May 06, 2009 at 04:11:05PM +0800, Gui Jianfeng wrote: > Vivek Goyal wrote: > > Hi All, > > > > Here is the V2 of the IO controller patches generated on top of 2.6.30-rc4. > > First version of the patches was posted here. > > Hi Vivek, > > I did some simple test for V2, and triggered an kernel panic. > The following script can reproduce this bug. It seems that the cgroup > is already removed, but IO Controller still try to access into it. > Hi Gui, Thanks for the report. I use cgroup_path() for debugging. I guess that cgroup_path() was passed a null cgrp pointer and that's why it crashed. If yes, then it is strange though. I call cgroup_path() only after grabbing a reference to the css object. (I am assuming that if I have a valid reference to the css object then css->cgrp can't be null). Anyway, can you please try out the following patch and see if it fixes your crash.
--- block/elevator-fq.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) Index: linux11/block/elevator-fq.c =================================================================== --- linux11.orig/block/elevator-fq.c 2009-05-05 15:38:06.000000000 -0400 +++ linux11/block/elevator-fq.c 2009-05-06 11:55:47.000000000 -0400 @@ -125,6 +125,9 @@ static void io_group_path(struct io_grou unsigned short id = iog->iocg_id; struct cgroup_subsys_state *css; + /* For error case */ + buf[0] = '\0'; + rcu_read_lock(); if (!id) @@ -137,15 +140,12 @@ static void io_group_path(struct io_grou if (!css_tryget(css)) goto out; - cgroup_path(css->cgroup, buf, buflen); + if (css->cgroup) + cgroup_path(css->cgroup, buf, buflen); css_put(css); - - rcu_read_unlock(); - return; out: rcu_read_unlock(); - buf[0] = '\0'; return; } #endif BTW, I tried following equivalent script and I can't see the crash on my system. Are you able to hit it regularly? Instead of killing the tasks I also tried moving the tasks into root cgroup and then deleting test1 and test2 groups, that also did not produce any crash. (Hit a different bug though after 5-6 attempts :-) As I mentioned in the patchset, currently we do have issues with group refcounting and cgroup/group going away. Hopefully in next version they all should be fixed up. But still, it is nice to hear back... #!/bin/sh ../mount-cgroups.sh # Mount disk mount /dev/sdd1 /mnt/sdd1 mount /dev/sdd2 /mnt/sdd2 echo 1 > /proc/sys/vm/drop_caches dd if=/dev/zero of=/mnt/sdd1/testzerofile1 bs=4K count=524288 & pid1=$! echo $pid1 > /cgroup/bfqio/test1/tasks echo "Launched $pid1" dd if=/dev/zero of=/mnt/sdd2/testzerofile1 bs=4K count=524288 & pid2=$! 
echo $pid2 > /cgroup/bfqio/test2/tasks echo "Launched $pid2" #echo "sleeping for 10 seconds" #sleep 10 #echo "Killing pid $pid1" #kill -9 $pid1 #echo "Killing pid $pid2" #kill -9 $pid2 #sleep 5 echo "sleeping for 10 seconds" sleep 10 echo "moving pid $pid1 to root" echo $pid1 > /cgroup/bfqio/tasks echo "moving pid $pid2 to root" echo $pid2 > /cgroup/bfqio/tasks echo ====== cat /cgroup/bfqio/test1/io.disk_time cat /cgroup/bfqio/test2/io.disk_time echo ====== cat /cgroup/bfqio/test1/io.disk_sectors cat /cgroup/bfqio/test2/io.disk_sectors echo "Removing test1" rmdir /cgroup/bfqio/test1 echo "Removing test2" rmdir /cgroup/bfqio/test2 echo "Unmounting /cgroup" umount /cgroup/bfqio echo "Done" #rmdir /cgroup > #!/bin/sh > echo 1 > /proc/sys/vm/drop_caches > mkdir /cgroup 2> /dev/null > mount -t cgroup -o io,blkio io /cgroup > mkdir /cgroup/test1 > mkdir /cgroup/test2 > echo 100 > /cgroup/test1/io.weight > echo 500 > /cgroup/test2/io.weight > > ./rwio -w -f 2000M.1 & //do async write > pid1=$! > echo $pid1 > /cgroup/test1/tasks > > ./rwio -w -f 2000M.2 & > pid2=$! 
> echo $pid2 > /cgroup/test2/tasks > > sleep 10 > kill -9 $pid1 > kill -9 $pid2 > sleep 1 > > echo ====== > cat /cgroup/test1/io.disk_time > cat /cgroup/test2/io.disk_time > > echo ====== > cat /cgroup/test1/io.disk_sectors > cat /cgroup/test2/io.disk_sectors > > rmdir /cgroup/test1 > rmdir /cgroup/test2 > umount /cgroup > rmdir /cgroup > > > BUG: unable to handle kernel NULL pointer dereferec > IP: [<c0448c24>] cgroup_path+0xc/0x97 > *pde = 64d2d067 > Oops: 0000 [#1] SMP > last sysfs file: /sys/block/md0/range > Modules linked in: ipv6 cpufreq_ondemand acpi_cpufreq dm_mirror dm_multipath sbd > Pid: 132, comm: kblockd/0 Not tainted (2.6.30-rc4-Vivek-V2 #1) Veriton M460 > EIP: 0060:[<c0448c24>] EFLAGS: 00010086 CPU: 0 > EIP is at cgroup_path+0xc/0x97 > EAX: 00000100 EBX: f60adca0 ECX: 00000080 EDX: f709fe28 > ESI: f60adca8 EDI: f709fe28 EBP: 00000100 ESP: f709fdf0 > DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068 > Process kblockd/0 (pid: 132, ti=f709f000 task=f70a8f60 task.ti=f709f000) > Stack: > f709fe28 f68c5698 f60adca0 f60adca8 f709fe28 f68de801 c04f5389 00000080 > f68de800 f7094d0c f6a29118 f68bde00 00000016 c04f5e8d c04f5340 00000080 > c0579fec f68c5e94 00000082 c042edb4 f68c5fd4 f68c5fd4 c080b520 00000082 > Call Trace: > [<c04f5389>] ? io_group_path+0x6d/0x89 > [<c04f5e8d>] ? elv_ioq_served+0x2a/0x7a > [<c04f5340>] ? io_group_path+0x24/0x89 > [<c0579fec>] ? ide_build_dmatable+0xda/0x130 > [<c042edb4>] ? lock_timer_base+0x19/0x35 > [<c042ef0c>] ? mod_timer+0x9f/0xa8 > [<c04fdee6>] ? __delay+0x6/0x7 > [<c057364f>] ? ide_execute_command+0x5d/0x71 > [<c0579d4f>] ? ide_dma_intr+0x0/0x99 > [<c0576496>] ? do_rw_taskfile+0x201/0x213 > [<c04f6daa>] ? __elv_ioq_slice_expired+0x212/0x25e > [<c04f7e15>] ? elv_fq_select_ioq+0x121/0x184 > [<c04e8a2f>] ? elv_select_sched_queue+0x1e/0x2e > [<c04f439c>] ? cfq_dispatch_requests+0xaa/0x238 > [<c04e7e67>] ? elv_next_request+0x152/0x15f > [<c04240c2>] ? dequeue_task_fair+0x16/0x2d > [<c0572f49>] ? 
do_ide_request+0x10f/0x4c8 > [<c0642d44>] ? __schedule+0x845/0x893 > [<c042edb4>] ? lock_timer_base+0x19/0x35 > [<c042f1be>] ? del_timer+0x41/0x47 > [<c04ea5c6>] ? __generic_unplug_device+0x23/0x25 > [<c04f530d>] ? elv_kick_queue+0x19/0x28 > [<c0434b77>] ? worker_thread+0x11f/0x19e > [<c04f52f4>] ? elv_kick_queue+0x0/0x28 > [<c0436ffc>] ? autoremove_wake_function+0x0/0x2d > [<c0434a58>] ? worker_thread+0x0/0x19e > [<c0436f3b>] ? kthread+0x42/0x67 > [<c0436ef9>] ? kthread+0x0/0x67 > [<c040326f>] ? kernel_thread_helper+0x7/0x10 > Code: c0 84 c0 74 0e 89 d8 e8 7c e9 fd ff eb 05 bf fd ff ff ff e8 c0 ea ff ff 8 > EIP: [<c0448c24>] cgroup_path+0xc/0x97 SS:ESP 0068:f709fdf0 > CR2: 000000000000011c > ---[ end trace 2d4bc25a2c33e394 ]--- > > -- > Regards > Gui Jianfeng > ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: IO scheduler based IO Controller V2 @ 2009-05-06 16:10 ` Vivek Goyal 0 siblings, 0 replies; 97+ messages in thread From: Vivek Goyal @ 2009-05-06 16:10 UTC (permalink / raw) To: Gui Jianfeng Cc: nauman, dpshah, lizf, mikew, fchecconi, paolo.valente, jens.axboe, ryov, fernando, s-uchida, taka, jmoyer, dhaval, balbir, linux-kernel, containers, righi.andrea, agk, dm-devel, snitzer, m-ikeda, akpm On Wed, May 06, 2009 at 04:11:05PM +0800, Gui Jianfeng wrote: > Vivek Goyal wrote: > > Hi All, > > > > Here is the V2 of the IO controller patches generated on top of 2.6.30-rc4. > > First version of the patches was posted here. > > Hi Vivek, > > I did some simple test for V2, and triggered an kernel panic. > The following script can reproduce this bug. It seems that the cgroup > is already removed, but IO Controller still try to access into it. > Hi Gui, Thanks for the report. I use cgroup_path() for debugging. I guess that cgroup_path() was passed null cgrp pointer that's why it crashed. If yes, then it is strange though. I call cgroup_path() only after grabbing a refenrece to css object. (I am assuming that if I have a valid reference to css object then css->cgrp can't be null). Anyway, can you please try out following patch and see if it fixes your crash. 
--- block/elevator-fq.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) Index: linux11/block/elevator-fq.c =================================================================== --- linux11.orig/block/elevator-fq.c 2009-05-05 15:38:06.000000000 -0400 +++ linux11/block/elevator-fq.c 2009-05-06 11:55:47.000000000 -0400 @@ -125,6 +125,9 @@ static void io_group_path(struct io_grou unsigned short id = iog->iocg_id; struct cgroup_subsys_state *css; + /* For error case */ + buf[0] = '\0'; + rcu_read_lock(); if (!id) @@ -137,15 +140,12 @@ static void io_group_path(struct io_grou if (!css_tryget(css)) goto out; - cgroup_path(css->cgroup, buf, buflen); + if (css->cgroup) + cgroup_path(css->cgroup, buf, buflen); css_put(css); - - rcu_read_unlock(); - return; out: rcu_read_unlock(); - buf[0] = '\0'; return; } #endif BTW, I tried following equivalent script and I can't see the crash on my system. Are you able to hit it regularly? Instead of killing the tasks I also tried moving the tasks into root cgroup and then deleting test1 and test2 groups, that also did not produce any crash. (Hit a different bug though after 5-6 attempts :-) As I mentioned in the patchset, currently we do have issues with group refcounting and cgroup/group going away. Hopefully in next version they all should be fixed up. But still, it is nice to hear back... #!/bin/sh ../mount-cgroups.sh # Mount disk mount /dev/sdd1 /mnt/sdd1 mount /dev/sdd2 /mnt/sdd2 echo 1 > /proc/sys/vm/drop_caches dd if=/dev/zero of=/mnt/sdd1/testzerofile1 bs=4K count=524288 & pid1=$! echo $pid1 > /cgroup/bfqio/test1/tasks echo "Launched $pid1" dd if=/dev/zero of=/mnt/sdd2/testzerofile1 bs=4K count=524288 & pid2=$! 
echo $pid2 > /cgroup/bfqio/test2/tasks
echo "Launched $pid2"

#echo "sleeping for 10 seconds"
#sleep 10
#echo "Killing pid $pid1"
#kill -9 $pid1
#echo "Killing pid $pid2"
#kill -9 $pid2
#sleep 5

echo "sleeping for 10 seconds"
sleep 10

echo "moving pid $pid1 to root"
echo $pid1 > /cgroup/bfqio/tasks
echo "moving pid $pid2 to root"
echo $pid2 > /cgroup/bfqio/tasks

echo ======
cat /cgroup/bfqio/test1/io.disk_time
cat /cgroup/bfqio/test2/io.disk_time

echo ======
cat /cgroup/bfqio/test1/io.disk_sectors
cat /cgroup/bfqio/test2/io.disk_sectors

echo "Removing test1"
rmdir /cgroup/bfqio/test1
echo "Removing test2"
rmdir /cgroup/bfqio/test2

echo "Unmounting /cgroup"
umount /cgroup/bfqio
echo "Done"
#rmdir /cgroup

> #!/bin/sh
> echo 1 > /proc/sys/vm/drop_caches
> mkdir /cgroup 2> /dev/null
> mount -t cgroup -o io,blkio io /cgroup
> mkdir /cgroup/test1
> mkdir /cgroup/test2
> echo 100 > /cgroup/test1/io.weight
> echo 500 > /cgroup/test2/io.weight
>
> ./rwio -w -f 2000M.1 &   # do async write
> pid1=$!
> echo $pid1 > /cgroup/test1/tasks
>
> ./rwio -w -f 2000M.2 &
> pid2=$!
> echo $pid2 > /cgroup/test2/tasks
>
> sleep 10
> kill -9 $pid1
> kill -9 $pid2
> sleep 1
>
> echo ======
> cat /cgroup/test1/io.disk_time
> cat /cgroup/test2/io.disk_time
>
> echo ======
> cat /cgroup/test1/io.disk_sectors
> cat /cgroup/test2/io.disk_sectors
>
> rmdir /cgroup/test1
> rmdir /cgroup/test2
> umount /cgroup
> rmdir /cgroup
>
>
> BUG: unable to handle kernel NULL pointer dereference
> IP: [<c0448c24>] cgroup_path+0xc/0x97
> *pde = 64d2d067
> Oops: 0000 [#1] SMP
> last sysfs file: /sys/block/md0/range
> Modules linked in: ipv6 cpufreq_ondemand acpi_cpufreq dm_mirror dm_multipath sbd
> Pid: 132, comm: kblockd/0 Not tainted (2.6.30-rc4-Vivek-V2 #1) Veriton M460
> EIP: 0060:[<c0448c24>] EFLAGS: 00010086 CPU: 0
> EIP is at cgroup_path+0xc/0x97
> EAX: 00000100 EBX: f60adca0 ECX: 00000080 EDX: f709fe28
> ESI: f60adca8 EDI: f709fe28 EBP: 00000100 ESP: f709fdf0
> DS: 007b ES: 007b FS: 00d8 GS: 0000 SS: 0068
> Process kblockd/0 (pid: 132, ti=f709f000 task=f70a8f60 task.ti=f709f000)
> Stack:
>  f709fe28 f68c5698 f60adca0 f60adca8 f709fe28 f68de801 c04f5389 00000080
>  f68de800 f7094d0c f6a29118 f68bde00 00000016 c04f5e8d c04f5340 00000080
>  c0579fec f68c5e94 00000082 c042edb4 f68c5fd4 f68c5fd4 c080b520 00000082
> Call Trace:
>  [<c04f5389>] ? io_group_path+0x6d/0x89
>  [<c04f5e8d>] ? elv_ioq_served+0x2a/0x7a
>  [<c04f5340>] ? io_group_path+0x24/0x89
>  [<c0579fec>] ? ide_build_dmatable+0xda/0x130
>  [<c042edb4>] ? lock_timer_base+0x19/0x35
>  [<c042ef0c>] ? mod_timer+0x9f/0xa8
>  [<c04fdee6>] ? __delay+0x6/0x7
>  [<c057364f>] ? ide_execute_command+0x5d/0x71
>  [<c0579d4f>] ? ide_dma_intr+0x0/0x99
>  [<c0576496>] ? do_rw_taskfile+0x201/0x213
>  [<c04f6daa>] ? __elv_ioq_slice_expired+0x212/0x25e
>  [<c04f7e15>] ? elv_fq_select_ioq+0x121/0x184
>  [<c04e8a2f>] ? elv_select_sched_queue+0x1e/0x2e
>  [<c04f439c>] ? cfq_dispatch_requests+0xaa/0x238
>  [<c04e7e67>] ? elv_next_request+0x152/0x15f
>  [<c04240c2>] ? dequeue_task_fair+0x16/0x2d
>  [<c0572f49>] ? do_ide_request+0x10f/0x4c8
>  [<c0642d44>] ? __schedule+0x845/0x893
>  [<c042edb4>] ? lock_timer_base+0x19/0x35
>  [<c042f1be>] ? del_timer+0x41/0x47
>  [<c04ea5c6>] ? __generic_unplug_device+0x23/0x25
>  [<c04f530d>] ? elv_kick_queue+0x19/0x28
>  [<c0434b77>] ? worker_thread+0x11f/0x19e
>  [<c04f52f4>] ? elv_kick_queue+0x0/0x28
>  [<c0436ffc>] ? autoremove_wake_function+0x0/0x2d
>  [<c0434a58>] ? worker_thread+0x0/0x19e
>  [<c0436f3b>] ? kthread+0x42/0x67
>  [<c0436ef9>] ? kthread+0x0/0x67
>  [<c040326f>] ? kernel_thread_helper+0x7/0x10
> Code: c0 84 c0 74 0e 89 d8 e8 7c e9 fd ff eb 05 bf fd ff ff ff e8 c0 ea ff ff 8
> EIP: [<c0448c24>] cgroup_path+0xc/0x97 SS:ESP 0068:f709fdf0
> CR2: 000000000000011c
> ---[ end trace 2d4bc25a2c33e394 ]---
>
> --
> Regards
> Gui Jianfeng
>

^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: IO scheduler based IO Controller V2
  2009-05-06 16:10 ` Vivek Goyal
  (?)
@ 2009-05-07  5:36 ` Li Zefan
  [not found] ` <4A027348.6000808-BthXqXjhjHXQFUHtdCDX3A@public.gmane.org>
  -1 siblings, 1 reply; 97+ messages in thread
From: Li Zefan @ 2009-05-07 5:36 UTC (permalink / raw)
To: Vivek Goyal
Cc: Gui Jianfeng, nauman, dpshah, mikew, fchecconi, paolo.valente,
  jens.axboe, ryov, fernando, s-uchida, taka, jmoyer, dhaval, balbir,
  linux-kernel, containers, righi.andrea, agk, dm-devel, snitzer,
  m-ikeda, akpm

[-- Attachment #1: Type: text/plain, Size: 2886 bytes --]

Vivek Goyal wrote:
> On Wed, May 06, 2009 at 04:11:05PM +0800, Gui Jianfeng wrote:
>> Vivek Goyal wrote:
>>> Hi All,
>>>
>>> Here is the V2 of the IO controller patches generated on top of 2.6.30-rc4.
>>> First version of the patches was posted here.
>>
>> Hi Vivek,
>>
>> I did some simple tests on V2 and triggered a kernel panic.
>> The following script can reproduce this bug. It seems that the cgroup
>> is already removed, but the IO controller still tries to access it.
>>
>
> Hi Gui,
>
> Thanks for the report. I use cgroup_path() for debugging. I guess that
> cgroup_path() was passed a NULL cgrp pointer, and that is why it crashed.
>
> If so, it is strange though. I call cgroup_path() only after grabbing a
> reference to the css object. (I am assuming that if I have a valid
> reference to the css object then css->cgrp can't be NULL.)
>

Yes, css->cgrp shouldn't be NULL. I suspect we've hit a bug in cgroup here.
The code dealing with css refcnt and cgroup rmdir has changed quite a lot,
and is much more complex than it was.

> Anyway, can you please try the following patch and see if it fixes your
> crash?
...
> BTW, I tried the following equivalent script and I can't see the crash on
> my system. Are you able to hit it regularly?
>

I modified the script like this:

======================
#!/bin/sh

echo 1 > /proc/sys/vm/drop_caches
mkdir /cgroup 2> /dev/null
mount -t cgroup -o io,blkio io /cgroup
mkdir /cgroup/test1
mkdir /cgroup/test2
echo 100 > /cgroup/test1/io.weight
echo 500 > /cgroup/test2/io.weight

dd if=/dev/zero bs=4096 count=128000 of=500M.1 &
pid1=$!
echo $pid1 > /cgroup/test1/tasks

dd if=/dev/zero bs=4096 count=128000 of=500M.2 &
pid2=$!
echo $pid2 > /cgroup/test2/tasks

sleep 5
kill -9 $pid1
kill -9 $pid2

for ((;count != 2;))
{
	rmdir /cgroup/test1 > /dev/null 2>&1
	if [ $? -eq 0 ]; then
		count=$(( $count + 1 ))
	fi

	rmdir /cgroup/test2 > /dev/null 2>&1
	if [ $? -eq 0 ]; then
		count=$(( $count + 1 ))
	fi
}

umount /cgroup
rmdir /cgroup
======================

I ran this script and got a lockdep BUG. The full log and my config are
attached.

Actually this can be triggered with the following steps on my box:
# mount -t cgroup -o blkio,io xxx /mnt
# mkdir /mnt/0
# echo $$ > /mnt/0/tasks
# echo 3 > /proc/sys/vm/drop_caches
# echo $$ > /mnt/tasks
# rmdir /mnt/0

And when I ran the script for the second time, my box froze and I had to
reset it.

> Instead of killing the tasks I also tried moving the tasks into the root
> cgroup and then deleting the test1 and test2 groups; that also did not
> produce any crash. (Hit a different bug though after 5-6 attempts :-)
>
> As I mentioned in the patchset, we currently do have issues with group
> refcounting and the cgroup/group going away. Hopefully in the next version
> they will all be fixed up. But still, it is nice to hear back...
> [-- Attachment #2: myconfig --] [-- Type: text/plain, Size: 64514 bytes --] # # Automatically generated make config: don't edit # Linux kernel version: 2.6.30-rc4 # Thu May 7 09:11:29 2009 # # CONFIG_64BIT is not set CONFIG_X86_32=y # CONFIG_X86_64 is not set CONFIG_X86=y CONFIG_ARCH_DEFCONFIG="arch/x86/configs/i386_defconfig" CONFIG_GENERIC_TIME=y CONFIG_GENERIC_CMOS_UPDATE=y CONFIG_CLOCKSOURCE_WATCHDOG=y CONFIG_GENERIC_CLOCKEVENTS=y CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y CONFIG_LOCKDEP_SUPPORT=y CONFIG_STACKTRACE_SUPPORT=y CONFIG_HAVE_LATENCYTOP_SUPPORT=y CONFIG_FAST_CMPXCHG_LOCAL=y CONFIG_MMU=y CONFIG_ZONE_DMA=y CONFIG_GENERIC_ISA_DMA=y CONFIG_GENERIC_IOMAP=y CONFIG_GENERIC_BUG=y CONFIG_GENERIC_HWEIGHT=y CONFIG_ARCH_MAY_HAVE_PC_FDC=y # CONFIG_RWSEM_GENERIC_SPINLOCK is not set CONFIG_RWSEM_XCHGADD_ALGORITHM=y CONFIG_ARCH_HAS_CPU_IDLE_WAIT=y CONFIG_GENERIC_CALIBRATE_DELAY=y # CONFIG_GENERIC_TIME_VSYSCALL is not set CONFIG_ARCH_HAS_CPU_RELAX=y CONFIG_ARCH_HAS_DEFAULT_IDLE=y CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y CONFIG_HAVE_SETUP_PER_CPU_AREA=y CONFIG_HAVE_DYNAMIC_PER_CPU_AREA=y # CONFIG_HAVE_CPUMASK_OF_CPU_MAP is not set CONFIG_ARCH_HIBERNATION_POSSIBLE=y CONFIG_ARCH_SUSPEND_POSSIBLE=y # CONFIG_ZONE_DMA32 is not set CONFIG_ARCH_POPULATES_NODE_MAP=y # CONFIG_AUDIT_ARCH is not set CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y CONFIG_GENERIC_HARDIRQS=y CONFIG_GENERIC_HARDIRQS_NO__DO_IRQ=y CONFIG_GENERIC_IRQ_PROBE=y CONFIG_GENERIC_PENDING_IRQ=y CONFIG_USE_GENERIC_SMP_HELPERS=y CONFIG_X86_32_SMP=y CONFIG_X86_HT=y CONFIG_X86_TRAMPOLINE=y CONFIG_X86_32_LAZY_GS=y CONFIG_KTIME_SCALAR=y CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config" # # General setup # CONFIG_EXPERIMENTAL=y CONFIG_LOCK_KERNEL=y CONFIG_INIT_ENV_ARG_LIMIT=32 CONFIG_LOCALVERSION="" # CONFIG_LOCALVERSION_AUTO is not set CONFIG_HAVE_KERNEL_GZIP=y CONFIG_HAVE_KERNEL_BZIP2=y CONFIG_HAVE_KERNEL_LZMA=y CONFIG_KERNEL_GZIP=y # CONFIG_KERNEL_BZIP2 is not set # 
CONFIG_KERNEL_LZMA is not set CONFIG_SWAP=y CONFIG_SYSVIPC=y CONFIG_SYSVIPC_SYSCTL=y CONFIG_POSIX_MQUEUE=y CONFIG_POSIX_MQUEUE_SYSCTL=y CONFIG_BSD_PROCESS_ACCT=y # CONFIG_BSD_PROCESS_ACCT_V3 is not set CONFIG_TASKSTATS=y CONFIG_TASK_DELAY_ACCT=y CONFIG_TASK_XACCT=y CONFIG_TASK_IO_ACCOUNTING=y # CONFIG_AUDIT is not set # # RCU Subsystem # # CONFIG_CLASSIC_RCU is not set # CONFIG_TREE_RCU is not set CONFIG_PREEMPT_RCU=y CONFIG_RCU_TRACE=y # CONFIG_TREE_RCU_TRACE is not set CONFIG_PREEMPT_RCU_TRACE=y # CONFIG_IKCONFIG is not set CONFIG_LOG_BUF_SHIFT=17 CONFIG_HAVE_UNSTABLE_SCHED_CLOCK=y CONFIG_GROUP_SCHED=y CONFIG_FAIR_GROUP_SCHED=y CONFIG_RT_GROUP_SCHED=y # CONFIG_USER_SCHED is not set CONFIG_CGROUP_SCHED=y CONFIG_CGROUPS=y CONFIG_CGROUP_DEBUG=y CONFIG_CGROUP_NS=y CONFIG_CGROUP_FREEZER=y CONFIG_CGROUP_DEVICE=y CONFIG_CPUSETS=y CONFIG_PROC_PID_CPUSET=y CONFIG_CGROUP_CPUACCT=y CONFIG_RESOURCE_COUNTERS=y CONFIG_CGROUP_MEM_RES_CTLR=y CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y CONFIG_GROUP_IOSCHED=y CONFIG_CGROUP_BLKIO=y CONFIG_CGROUP_PAGE=y CONFIG_MM_OWNER=y CONFIG_SYSFS_DEPRECATED=y CONFIG_SYSFS_DEPRECATED_V2=y CONFIG_RELAY=y CONFIG_NAMESPACES=y # CONFIG_UTS_NS is not set # CONFIG_IPC_NS is not set CONFIG_USER_NS=y CONFIG_PID_NS=y # CONFIG_NET_NS is not set CONFIG_BLK_DEV_INITRD=y CONFIG_INITRAMFS_SOURCE="" CONFIG_RD_GZIP=y CONFIG_RD_BZIP2=y CONFIG_RD_LZMA=y CONFIG_CC_OPTIMIZE_FOR_SIZE=y CONFIG_SYSCTL=y CONFIG_ANON_INODES=y # CONFIG_EMBEDDED is not set CONFIG_UID16=y CONFIG_SYSCTL_SYSCALL=y CONFIG_KALLSYMS=y CONFIG_KALLSYMS_ALL=y CONFIG_KALLSYMS_EXTRA_PASS=y # CONFIG_STRIP_ASM_SYMS is not set CONFIG_HOTPLUG=y CONFIG_PRINTK=y CONFIG_BUG=y CONFIG_ELF_CORE=y CONFIG_PCSPKR_PLATFORM=y CONFIG_BASE_FULL=y CONFIG_FUTEX=y CONFIG_EPOLL=y CONFIG_SIGNALFD=y CONFIG_TIMERFD=y CONFIG_EVENTFD=y CONFIG_SHMEM=y CONFIG_AIO=y CONFIG_VM_EVENT_COUNTERS=y CONFIG_PCI_QUIRKS=y CONFIG_SLUB_DEBUG=y CONFIG_COMPAT_BRK=y # CONFIG_SLAB is not set CONFIG_SLUB=y # CONFIG_SLOB is not set CONFIG_PROFILING=y 
CONFIG_TRACEPOINTS=y CONFIG_MARKERS=y CONFIG_OPROFILE=m # CONFIG_OPROFILE_IBS is not set CONFIG_HAVE_OPROFILE=y CONFIG_KPROBES=y CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y CONFIG_KRETPROBES=y CONFIG_HAVE_IOREMAP_PROT=y CONFIG_HAVE_KPROBES=y CONFIG_HAVE_KRETPROBES=y CONFIG_HAVE_ARCH_TRACEHOOK=y CONFIG_HAVE_DMA_API_DEBUG=y # CONFIG_SLOW_WORK is not set CONFIG_HAVE_GENERIC_DMA_COHERENT=y CONFIG_SLABINFO=y CONFIG_RT_MUTEXES=y CONFIG_BASE_SMALL=0 CONFIG_MODULES=y # CONFIG_MODULE_FORCE_LOAD is not set CONFIG_MODULE_UNLOAD=y # CONFIG_MODULE_FORCE_UNLOAD is not set # CONFIG_MODVERSIONS is not set # CONFIG_MODULE_SRCVERSION_ALL is not set CONFIG_STOP_MACHINE=y CONFIG_BLOCK=y CONFIG_LBD=y CONFIG_BLK_DEV_BSG=y # CONFIG_BLK_DEV_INTEGRITY is not set # # IO Schedulers # CONFIG_ELV_FAIR_QUEUING=y CONFIG_IOSCHED_NOOP=y CONFIG_IOSCHED_NOOP_HIER=y CONFIG_IOSCHED_AS=m CONFIG_IOSCHED_AS_HIER=y CONFIG_IOSCHED_DEADLINE=m CONFIG_IOSCHED_DEADLINE_HIER=y CONFIG_IOSCHED_CFQ=y CONFIG_IOSCHED_CFQ_HIER=y # CONFIG_DEFAULT_AS is not set # CONFIG_DEFAULT_DEADLINE is not set CONFIG_DEFAULT_CFQ=y # CONFIG_DEFAULT_NOOP is not set CONFIG_DEFAULT_IOSCHED="cfq" CONFIG_TRACK_ASYNC_CONTEXT=y CONFIG_DEBUG_GROUP_IOSCHED=y CONFIG_FREEZER=y # # Processor type and features # CONFIG_TICK_ONESHOT=y CONFIG_NO_HZ=y CONFIG_HIGH_RES_TIMERS=y CONFIG_GENERIC_CLOCKEVENTS_BUILD=y CONFIG_SMP=y # CONFIG_SPARSE_IRQ is not set CONFIG_X86_MPPARSE=y # CONFIG_X86_BIGSMP is not set CONFIG_X86_EXTENDED_PLATFORM=y # CONFIG_X86_ELAN is not set # CONFIG_X86_RDC321X is not set # CONFIG_X86_32_NON_STANDARD is not set CONFIG_SCHED_OMIT_FRAME_POINTER=y # CONFIG_PARAVIRT_GUEST is not set # CONFIG_MEMTEST is not set # CONFIG_M386 is not set # CONFIG_M486 is not set # CONFIG_M586 is not set # CONFIG_M586TSC is not set # CONFIG_M586MMX is not set CONFIG_M686=y # CONFIG_MPENTIUMII is not set # CONFIG_MPENTIUMIII is not set # CONFIG_MPENTIUMM is not set # CONFIG_MPENTIUM4 is not set # CONFIG_MK6 is not set # CONFIG_MK7 is not set # CONFIG_MK8 
is not set # CONFIG_MCRUSOE is not set # CONFIG_MEFFICEON is not set # CONFIG_MWINCHIPC6 is not set # CONFIG_MWINCHIP3D is not set # CONFIG_MGEODEGX1 is not set # CONFIG_MGEODE_LX is not set # CONFIG_MCYRIXIII is not set # CONFIG_MVIAC3_2 is not set # CONFIG_MVIAC7 is not set # CONFIG_MPSC is not set # CONFIG_MCORE2 is not set # CONFIG_GENERIC_CPU is not set CONFIG_X86_GENERIC=y CONFIG_X86_CPU=y CONFIG_X86_L1_CACHE_BYTES=64 CONFIG_X86_INTERNODE_CACHE_BYTES=64 CONFIG_X86_CMPXCHG=y CONFIG_X86_L1_CACHE_SHIFT=5 CONFIG_X86_XADD=y CONFIG_X86_PPRO_FENCE=y CONFIG_X86_WP_WORKS_OK=y CONFIG_X86_INVLPG=y CONFIG_X86_BSWAP=y CONFIG_X86_POPAD_OK=y CONFIG_X86_INTEL_USERCOPY=y CONFIG_X86_USE_PPRO_CHECKSUM=y CONFIG_X86_TSC=y CONFIG_X86_CMOV=y CONFIG_X86_MINIMUM_CPU_FAMILY=4 CONFIG_X86_DEBUGCTLMSR=y CONFIG_CPU_SUP_INTEL=y CONFIG_CPU_SUP_CYRIX_32=y CONFIG_CPU_SUP_AMD=y CONFIG_CPU_SUP_CENTAUR=y CONFIG_CPU_SUP_TRANSMETA_32=y CONFIG_CPU_SUP_UMC_32=y # CONFIG_X86_DS is not set CONFIG_HPET_TIMER=y CONFIG_HPET_EMULATE_RTC=y CONFIG_DMI=y # CONFIG_IOMMU_HELPER is not set # CONFIG_IOMMU_API is not set CONFIG_NR_CPUS=8 # CONFIG_SCHED_SMT is not set CONFIG_SCHED_MC=y # CONFIG_PREEMPT_NONE is not set # CONFIG_PREEMPT_VOLUNTARY is not set CONFIG_PREEMPT=y CONFIG_X86_LOCAL_APIC=y CONFIG_X86_IO_APIC=y # CONFIG_X86_REROUTE_FOR_BROKEN_BOOT_IRQS is not set CONFIG_X86_MCE=y # CONFIG_X86_MCE_NONFATAL is not set # CONFIG_X86_MCE_P4THERMAL is not set CONFIG_VM86=y # CONFIG_TOSHIBA is not set # CONFIG_I8K is not set # CONFIG_X86_REBOOTFIXUPS is not set # CONFIG_MICROCODE is not set CONFIG_X86_MSR=m CONFIG_X86_CPUID=m # CONFIG_X86_CPU_DEBUG is not set # CONFIG_NOHIGHMEM is not set CONFIG_HIGHMEM4G=y # CONFIG_HIGHMEM64G is not set CONFIG_PAGE_OFFSET=0xC0000000 CONFIG_HIGHMEM=y # CONFIG_ARCH_PHYS_ADDR_T_64BIT is not set CONFIG_ARCH_FLATMEM_ENABLE=y CONFIG_ARCH_SPARSEMEM_ENABLE=y CONFIG_ARCH_SELECT_MEMORY_MODEL=y CONFIG_SELECT_MEMORY_MODEL=y CONFIG_FLATMEM_MANUAL=y # CONFIG_DISCONTIGMEM_MANUAL is not set # 
CONFIG_SPARSEMEM_MANUAL is not set CONFIG_FLATMEM=y CONFIG_FLAT_NODE_MEM_MAP=y CONFIG_SPARSEMEM_STATIC=y CONFIG_PAGEFLAGS_EXTENDED=y CONFIG_SPLIT_PTLOCK_CPUS=4 # CONFIG_PHYS_ADDR_T_64BIT is not set CONFIG_ZONE_DMA_FLAG=1 CONFIG_BOUNCE=y CONFIG_VIRT_TO_BUS=y CONFIG_UNEVICTABLE_LRU=y CONFIG_HAVE_MLOCK=y CONFIG_HAVE_MLOCKED_PAGE_BIT=y CONFIG_HIGHPTE=y # CONFIG_X86_CHECK_BIOS_CORRUPTION is not set CONFIG_X86_RESERVE_LOW_64K=y # CONFIG_MATH_EMULATION is not set CONFIG_MTRR=y CONFIG_MTRR_SANITIZER=y CONFIG_MTRR_SANITIZER_ENABLE_DEFAULT=0 CONFIG_MTRR_SANITIZER_SPARE_REG_NR_DEFAULT=1 # CONFIG_X86_PAT is not set CONFIG_EFI=y CONFIG_SECCOMP=y # CONFIG_CC_STACKPROTECTOR is not set # CONFIG_HZ_100 is not set # CONFIG_HZ_250 is not set # CONFIG_HZ_300 is not set CONFIG_HZ_1000=y CONFIG_HZ=1000 CONFIG_SCHED_HRTICK=y CONFIG_KEXEC=y CONFIG_CRASH_DUMP=y CONFIG_PHYSICAL_START=0x1000000 CONFIG_RELOCATABLE=y CONFIG_PHYSICAL_ALIGN=0x400000 CONFIG_HOTPLUG_CPU=y # CONFIG_COMPAT_VDSO is not set # CONFIG_CMDLINE_BOOL is not set CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y # # Power management and ACPI options # CONFIG_PM=y CONFIG_PM_DEBUG=y # CONFIG_PM_VERBOSE is not set CONFIG_CAN_PM_TRACE=y # CONFIG_PM_TRACE_RTC is not set CONFIG_PM_SLEEP_SMP=y CONFIG_PM_SLEEP=y CONFIG_SUSPEND=y CONFIG_SUSPEND_FREEZER=y # CONFIG_HIBERNATION is not set CONFIG_ACPI=y CONFIG_ACPI_SLEEP=y # CONFIG_ACPI_PROCFS is not set # CONFIG_ACPI_PROCFS_POWER is not set CONFIG_ACPI_SYSFS_POWER=y # CONFIG_ACPI_PROC_EVENT is not set CONFIG_ACPI_AC=m # CONFIG_ACPI_BATTERY is not set CONFIG_ACPI_BUTTON=m CONFIG_ACPI_VIDEO=m CONFIG_ACPI_FAN=y CONFIG_ACPI_DOCK=y CONFIG_ACPI_PROCESSOR=y CONFIG_ACPI_HOTPLUG_CPU=y CONFIG_ACPI_THERMAL=y # CONFIG_ACPI_CUSTOM_DSDT is not set CONFIG_ACPI_BLACKLIST_YEAR=1999 # CONFIG_ACPI_DEBUG is not set # CONFIG_ACPI_PCI_SLOT is not set CONFIG_X86_PM_TIMER=y CONFIG_ACPI_CONTAINER=y # CONFIG_ACPI_SBS is not set CONFIG_X86_APM_BOOT=y CONFIG_APM=y # CONFIG_APM_IGNORE_USER_SUSPEND is not set # 
CONFIG_APM_DO_ENABLE is not set CONFIG_APM_CPU_IDLE=y # CONFIG_APM_DISPLAY_BLANK is not set # CONFIG_APM_ALLOW_INTS is not set # # CPU Frequency scaling # CONFIG_CPU_FREQ=y CONFIG_CPU_FREQ_TABLE=y CONFIG_CPU_FREQ_DEBUG=y CONFIG_CPU_FREQ_STAT=m CONFIG_CPU_FREQ_STAT_DETAILS=y # CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE is not set # CONFIG_CPU_FREQ_DEFAULT_GOV_POWERSAVE is not set CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE=y # CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND is not set # CONFIG_CPU_FREQ_DEFAULT_GOV_CONSERVATIVE is not set CONFIG_CPU_FREQ_GOV_PERFORMANCE=y CONFIG_CPU_FREQ_GOV_POWERSAVE=m CONFIG_CPU_FREQ_GOV_USERSPACE=y CONFIG_CPU_FREQ_GOV_ONDEMAND=m CONFIG_CPU_FREQ_GOV_CONSERVATIVE=m # # CPUFreq processor drivers # # CONFIG_X86_ACPI_CPUFREQ is not set # CONFIG_X86_POWERNOW_K6 is not set # CONFIG_X86_POWERNOW_K7 is not set # CONFIG_X86_POWERNOW_K8 is not set # CONFIG_X86_GX_SUSPMOD is not set # CONFIG_X86_SPEEDSTEP_CENTRINO is not set CONFIG_X86_SPEEDSTEP_ICH=y CONFIG_X86_SPEEDSTEP_SMI=y # CONFIG_X86_P4_CLOCKMOD is not set # CONFIG_X86_CPUFREQ_NFORCE2 is not set # CONFIG_X86_LONGRUN is not set # CONFIG_X86_LONGHAUL is not set # CONFIG_X86_E_POWERSAVER is not set # # shared options # CONFIG_X86_SPEEDSTEP_LIB=y # CONFIG_X86_SPEEDSTEP_RELAXED_CAP_CHECK is not set CONFIG_CPU_IDLE=y CONFIG_CPU_IDLE_GOV_LADDER=y CONFIG_CPU_IDLE_GOV_MENU=y # # Bus options (PCI etc.) 
# CONFIG_PCI=y # CONFIG_PCI_GOBIOS is not set # CONFIG_PCI_GOMMCONFIG is not set # CONFIG_PCI_GODIRECT is not set # CONFIG_PCI_GOOLPC is not set CONFIG_PCI_GOANY=y CONFIG_PCI_BIOS=y CONFIG_PCI_DIRECT=y CONFIG_PCI_MMCONFIG=y CONFIG_PCI_DOMAINS=y CONFIG_PCIEPORTBUS=y CONFIG_HOTPLUG_PCI_PCIE=m CONFIG_PCIEAER=y # CONFIG_PCIEASPM is not set CONFIG_ARCH_SUPPORTS_MSI=y # CONFIG_PCI_MSI is not set CONFIG_PCI_LEGACY=y # CONFIG_PCI_DEBUG is not set # CONFIG_PCI_STUB is not set CONFIG_HT_IRQ=y # CONFIG_PCI_IOV is not set CONFIG_ISA_DMA_API=y CONFIG_ISA=y # CONFIG_EISA is not set # CONFIG_MCA is not set # CONFIG_SCx200 is not set # CONFIG_OLPC is not set CONFIG_PCCARD=y # CONFIG_PCMCIA_DEBUG is not set CONFIG_PCMCIA=y CONFIG_PCMCIA_LOAD_CIS=y # CONFIG_PCMCIA_IOCTL is not set CONFIG_CARDBUS=y # # PC-card bridges # CONFIG_YENTA=y CONFIG_YENTA_O2=y CONFIG_YENTA_RICOH=y CONFIG_YENTA_TI=y CONFIG_YENTA_ENE_TUNE=y CONFIG_YENTA_TOSHIBA=y # CONFIG_PD6729 is not set # CONFIG_I82092 is not set # CONFIG_I82365 is not set # CONFIG_TCIC is not set CONFIG_PCMCIA_PROBE=y CONFIG_PCCARD_NONSTATIC=y CONFIG_HOTPLUG_PCI=y CONFIG_HOTPLUG_PCI_FAKE=m # CONFIG_HOTPLUG_PCI_COMPAQ is not set # CONFIG_HOTPLUG_PCI_IBM is not set CONFIG_HOTPLUG_PCI_ACPI=m CONFIG_HOTPLUG_PCI_ACPI_IBM=m # CONFIG_HOTPLUG_PCI_CPCI is not set # CONFIG_HOTPLUG_PCI_SHPC is not set # # Executable file formats / Emulations # CONFIG_BINFMT_ELF=y # CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS is not set CONFIG_HAVE_AOUT=y # CONFIG_BINFMT_AOUT is not set CONFIG_BINFMT_MISC=y CONFIG_HAVE_ATOMIC_IOMAP=y CONFIG_NET=y # # Networking options # CONFIG_PACKET=y CONFIG_PACKET_MMAP=y CONFIG_UNIX=y # CONFIG_NET_KEY is not set CONFIG_INET=y CONFIG_IP_MULTICAST=y CONFIG_IP_ADVANCED_ROUTER=y CONFIG_ASK_IP_FIB_HASH=y # CONFIG_IP_FIB_TRIE is not set CONFIG_IP_FIB_HASH=y CONFIG_IP_MULTIPLE_TABLES=y CONFIG_IP_ROUTE_MULTIPATH=y CONFIG_IP_ROUTE_VERBOSE=y # CONFIG_IP_PNP is not set CONFIG_NET_IPIP=m # CONFIG_NET_IPGRE is not set CONFIG_IP_MROUTE=y 
CONFIG_IP_PIMSM_V1=y CONFIG_IP_PIMSM_V2=y # CONFIG_ARPD is not set CONFIG_SYN_COOKIES=y # CONFIG_INET_AH is not set # CONFIG_INET_ESP is not set # CONFIG_INET_IPCOMP is not set # CONFIG_INET_XFRM_TUNNEL is not set CONFIG_INET_TUNNEL=m # CONFIG_INET_XFRM_MODE_TRANSPORT is not set # CONFIG_INET_XFRM_MODE_TUNNEL is not set # CONFIG_INET_XFRM_MODE_BEET is not set CONFIG_INET_LRO=m CONFIG_INET_DIAG=m CONFIG_INET_TCP_DIAG=m CONFIG_TCP_CONG_ADVANCED=y CONFIG_TCP_CONG_BIC=m CONFIG_TCP_CONG_CUBIC=y # CONFIG_TCP_CONG_WESTWOOD is not set # CONFIG_TCP_CONG_HTCP is not set CONFIG_TCP_CONG_HSTCP=m CONFIG_TCP_CONG_HYBLA=m # CONFIG_TCP_CONG_VEGAS is not set CONFIG_TCP_CONG_SCALABLE=m CONFIG_TCP_CONG_LP=m # CONFIG_TCP_CONG_VENO is not set # CONFIG_TCP_CONG_YEAH is not set CONFIG_TCP_CONG_ILLINOIS=m # CONFIG_DEFAULT_BIC is not set CONFIG_DEFAULT_CUBIC=y # CONFIG_DEFAULT_HTCP is not set # CONFIG_DEFAULT_VEGAS is not set # CONFIG_DEFAULT_WESTWOOD is not set # CONFIG_DEFAULT_RENO is not set CONFIG_DEFAULT_TCP_CONG="cubic" # CONFIG_TCP_MD5SIG is not set # CONFIG_IPV6 is not set # CONFIG_NETWORK_SECMARK is not set # CONFIG_NETFILTER is not set # CONFIG_IP_DCCP is not set # CONFIG_IP_SCTP is not set # CONFIG_TIPC is not set # CONFIG_ATM is not set CONFIG_STP=m CONFIG_BRIDGE=m # CONFIG_NET_DSA is not set # CONFIG_VLAN_8021Q is not set # CONFIG_DECNET is not set CONFIG_LLC=m # CONFIG_LLC2 is not set # CONFIG_IPX is not set # CONFIG_ATALK is not set # CONFIG_X25 is not set # CONFIG_LAPB is not set # CONFIG_ECONET is not set # CONFIG_WAN_ROUTER is not set # CONFIG_PHONET is not set CONFIG_NET_SCHED=y # # Queueing/Scheduling # # CONFIG_NET_SCH_CBQ is not set # CONFIG_NET_SCH_HTB is not set # CONFIG_NET_SCH_HFSC is not set # CONFIG_NET_SCH_PRIO is not set # CONFIG_NET_SCH_MULTIQ is not set # CONFIG_NET_SCH_RED is not set # CONFIG_NET_SCH_SFQ is not set # CONFIG_NET_SCH_TEQL is not set # CONFIG_NET_SCH_TBF is not set # CONFIG_NET_SCH_GRED is not set # CONFIG_NET_SCH_DSMARK is not set # 
CONFIG_NET_SCH_NETEM is not set # CONFIG_NET_SCH_DRR is not set # # Classification # CONFIG_NET_CLS=y # CONFIG_NET_CLS_BASIC is not set # CONFIG_NET_CLS_TCINDEX is not set # CONFIG_NET_CLS_ROUTE4 is not set # CONFIG_NET_CLS_FW is not set # CONFIG_NET_CLS_U32 is not set # CONFIG_NET_CLS_RSVP is not set # CONFIG_NET_CLS_RSVP6 is not set # CONFIG_NET_CLS_FLOW is not set CONFIG_NET_CLS_CGROUP=y # CONFIG_NET_EMATCH is not set # CONFIG_NET_CLS_ACT is not set CONFIG_NET_SCH_FIFO=y # CONFIG_DCB is not set # # Network testing # # CONFIG_NET_PKTGEN is not set # CONFIG_NET_TCPPROBE is not set # CONFIG_NET_DROP_MONITOR is not set # CONFIG_HAMRADIO is not set # CONFIG_CAN is not set # CONFIG_IRDA is not set # CONFIG_BT is not set # CONFIG_AF_RXRPC is not set CONFIG_FIB_RULES=y # CONFIG_WIRELESS is not set # CONFIG_WIMAX is not set # CONFIG_RFKILL is not set # CONFIG_NET_9P is not set # # Device Drivers # # # Generic Driver Options # CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug" CONFIG_STANDALONE=y CONFIG_PREVENT_FIRMWARE_BUILD=y CONFIG_FW_LOADER=y CONFIG_FIRMWARE_IN_KERNEL=y CONFIG_EXTRA_FIRMWARE="" # CONFIG_DEBUG_DRIVER is not set CONFIG_DEBUG_DEVRES=y # CONFIG_SYS_HYPERVISOR is not set # CONFIG_CONNECTOR is not set # CONFIG_MTD is not set CONFIG_PARPORT=m CONFIG_PARPORT_PC=m CONFIG_PARPORT_SERIAL=m # CONFIG_PARPORT_PC_FIFO is not set # CONFIG_PARPORT_PC_SUPERIO is not set CONFIG_PARPORT_PC_PCMCIA=m # CONFIG_PARPORT_GSC is not set # CONFIG_PARPORT_AX88796 is not set CONFIG_PARPORT_1284=y CONFIG_PNP=y CONFIG_PNP_DEBUG_MESSAGES=y # # Protocols # CONFIG_ISAPNP=y # CONFIG_PNPBIOS is not set CONFIG_PNPACPI=y CONFIG_BLK_DEV=y # CONFIG_BLK_DEV_FD is not set # CONFIG_BLK_DEV_XD is not set CONFIG_PARIDE=m # # Parallel IDE high-level drivers # CONFIG_PARIDE_PD=m CONFIG_PARIDE_PCD=m CONFIG_PARIDE_PF=m # CONFIG_PARIDE_PT is not set CONFIG_PARIDE_PG=m # # Parallel IDE protocol modules # # CONFIG_PARIDE_ATEN is not set # CONFIG_PARIDE_BPCK is not set # CONFIG_PARIDE_BPCK6 is not set # 
CONFIG_PARIDE_COMM is not set # CONFIG_PARIDE_DSTR is not set # CONFIG_PARIDE_FIT2 is not set # CONFIG_PARIDE_FIT3 is not set # CONFIG_PARIDE_EPAT is not set # CONFIG_PARIDE_EPIA is not set # CONFIG_PARIDE_FRIQ is not set # CONFIG_PARIDE_FRPW is not set # CONFIG_PARIDE_KBIC is not set # CONFIG_PARIDE_KTTI is not set # CONFIG_PARIDE_ON20 is not set # CONFIG_PARIDE_ON26 is not set # CONFIG_BLK_CPQ_DA is not set # CONFIG_BLK_CPQ_CISS_DA is not set # CONFIG_BLK_DEV_DAC960 is not set # CONFIG_BLK_DEV_UMEM is not set # CONFIG_BLK_DEV_COW_COMMON is not set CONFIG_BLK_DEV_LOOP=m CONFIG_BLK_DEV_CRYPTOLOOP=m CONFIG_BLK_DEV_NBD=m # CONFIG_BLK_DEV_SX8 is not set # CONFIG_BLK_DEV_UB is not set CONFIG_BLK_DEV_RAM=y CONFIG_BLK_DEV_RAM_COUNT=16 CONFIG_BLK_DEV_RAM_SIZE=16384 # CONFIG_BLK_DEV_XIP is not set # CONFIG_CDROM_PKTCDVD is not set # CONFIG_ATA_OVER_ETH is not set # CONFIG_BLK_DEV_HD is not set CONFIG_MISC_DEVICES=y # CONFIG_IBM_ASM is not set # CONFIG_PHANTOM is not set # CONFIG_SGI_IOC4 is not set # CONFIG_TIFM_CORE is not set # CONFIG_ICS932S401 is not set # CONFIG_ENCLOSURE_SERVICES is not set # CONFIG_HP_ILO is not set # CONFIG_ISL29003 is not set # CONFIG_C2PORT is not set # # EEPROM support # # CONFIG_EEPROM_AT24 is not set # CONFIG_EEPROM_LEGACY is not set CONFIG_EEPROM_93CX6=m CONFIG_HAVE_IDE=y # CONFIG_IDE is not set # # SCSI device support # # CONFIG_RAID_ATTRS is not set CONFIG_SCSI=m CONFIG_SCSI_DMA=y CONFIG_SCSI_TGT=m CONFIG_SCSI_NETLINK=y CONFIG_SCSI_PROC_FS=y # # SCSI support type (disk, tape, CD-ROM) # CONFIG_BLK_DEV_SD=m # CONFIG_CHR_DEV_ST is not set # CONFIG_CHR_DEV_OSST is not set CONFIG_BLK_DEV_SR=m CONFIG_BLK_DEV_SR_VENDOR=y CONFIG_CHR_DEV_SG=m CONFIG_CHR_DEV_SCH=m # # Some SCSI devices (e.g. 
CD jukebox) support multiple LUNs # CONFIG_SCSI_MULTI_LUN=y # CONFIG_SCSI_CONSTANTS is not set CONFIG_SCSI_LOGGING=y CONFIG_SCSI_SCAN_ASYNC=y CONFIG_SCSI_WAIT_SCAN=m # # SCSI Transports # CONFIG_SCSI_SPI_ATTRS=m CONFIG_SCSI_FC_ATTRS=m # CONFIG_SCSI_FC_TGT_ATTRS is not set CONFIG_SCSI_ISCSI_ATTRS=m CONFIG_SCSI_SAS_ATTRS=m CONFIG_SCSI_SAS_LIBSAS=m CONFIG_SCSI_SAS_ATA=y CONFIG_SCSI_SAS_HOST_SMP=y # CONFIG_SCSI_SAS_LIBSAS_DEBUG is not set CONFIG_SCSI_SRP_ATTRS=m # CONFIG_SCSI_SRP_TGT_ATTRS is not set CONFIG_SCSI_LOWLEVEL=y CONFIG_ISCSI_TCP=m # CONFIG_BLK_DEV_3W_XXXX_RAID is not set # CONFIG_SCSI_3W_9XXX is not set # CONFIG_SCSI_7000FASST is not set CONFIG_SCSI_ACARD=m # CONFIG_SCSI_AHA152X is not set # CONFIG_SCSI_AHA1542 is not set # CONFIG_SCSI_AACRAID is not set CONFIG_SCSI_AIC7XXX=m CONFIG_AIC7XXX_CMDS_PER_DEVICE=4 CONFIG_AIC7XXX_RESET_DELAY_MS=15000 # CONFIG_AIC7XXX_DEBUG_ENABLE is not set CONFIG_AIC7XXX_DEBUG_MASK=0 # CONFIG_AIC7XXX_REG_PRETTY_PRINT is not set CONFIG_SCSI_AIC7XXX_OLD=m CONFIG_SCSI_AIC79XX=m CONFIG_AIC79XX_CMDS_PER_DEVICE=4 CONFIG_AIC79XX_RESET_DELAY_MS=15000 # CONFIG_AIC79XX_DEBUG_ENABLE is not set CONFIG_AIC79XX_DEBUG_MASK=0 # CONFIG_AIC79XX_REG_PRETTY_PRINT is not set CONFIG_SCSI_AIC94XX=m # CONFIG_AIC94XX_DEBUG is not set # CONFIG_SCSI_DPT_I2O is not set CONFIG_SCSI_ADVANSYS=m # CONFIG_SCSI_IN2000 is not set # CONFIG_SCSI_ARCMSR is not set # CONFIG_MEGARAID_NEWGEN is not set # CONFIG_MEGARAID_LEGACY is not set # CONFIG_MEGARAID_SAS is not set # CONFIG_SCSI_MPT2SAS is not set # CONFIG_SCSI_HPTIOP is not set CONFIG_SCSI_BUSLOGIC=m # CONFIG_SCSI_FLASHPOINT is not set # CONFIG_LIBFC is not set # CONFIG_LIBFCOE is not set # CONFIG_FCOE is not set # CONFIG_SCSI_DMX3191D is not set # CONFIG_SCSI_DTC3280 is not set # CONFIG_SCSI_EATA is not set # CONFIG_SCSI_FUTURE_DOMAIN is not set CONFIG_SCSI_GDTH=m # CONFIG_SCSI_GENERIC_NCR5380 is not set # CONFIG_SCSI_GENERIC_NCR5380_MMIO is not set CONFIG_SCSI_IPS=m CONFIG_SCSI_INITIO=m CONFIG_SCSI_INIA100=m 
CONFIG_SCSI_PPA=m CONFIG_SCSI_IMM=m # CONFIG_SCSI_IZIP_EPP16 is not set # CONFIG_SCSI_IZIP_SLOW_CTR is not set # CONFIG_SCSI_MVSAS is not set # CONFIG_SCSI_NCR53C406A is not set # CONFIG_SCSI_STEX is not set CONFIG_SCSI_SYM53C8XX_2=m CONFIG_SCSI_SYM53C8XX_DMA_ADDRESSING_MODE=1 CONFIG_SCSI_SYM53C8XX_DEFAULT_TAGS=16 CONFIG_SCSI_SYM53C8XX_MAX_TAGS=64 CONFIG_SCSI_SYM53C8XX_MMIO=y # CONFIG_SCSI_IPR is not set # CONFIG_SCSI_PAS16 is not set # CONFIG_SCSI_QLOGIC_FAS is not set # CONFIG_SCSI_QLOGIC_1280 is not set # CONFIG_SCSI_QLA_FC is not set # CONFIG_SCSI_QLA_ISCSI is not set # CONFIG_SCSI_LPFC is not set # CONFIG_SCSI_SYM53C416 is not set # CONFIG_SCSI_DC395x is not set # CONFIG_SCSI_DC390T is not set # CONFIG_SCSI_T128 is not set # CONFIG_SCSI_U14_34F is not set # CONFIG_SCSI_ULTRASTOR is not set # CONFIG_SCSI_NSP32 is not set # CONFIG_SCSI_DEBUG is not set # CONFIG_SCSI_SRP is not set CONFIG_SCSI_LOWLEVEL_PCMCIA=y # CONFIG_PCMCIA_AHA152X is not set # CONFIG_PCMCIA_FDOMAIN is not set # CONFIG_PCMCIA_NINJA_SCSI is not set CONFIG_PCMCIA_QLOGIC=m # CONFIG_PCMCIA_SYM53C500 is not set # CONFIG_SCSI_DH is not set # CONFIG_SCSI_OSD_INITIATOR is not set CONFIG_ATA=m # CONFIG_ATA_NONSTANDARD is not set CONFIG_ATA_ACPI=y CONFIG_SATA_PMP=y CONFIG_SATA_AHCI=m # CONFIG_SATA_SIL24 is not set CONFIG_ATA_SFF=y # CONFIG_SATA_SVW is not set CONFIG_ATA_PIIX=m # CONFIG_SATA_MV is not set CONFIG_SATA_NV=m # CONFIG_PDC_ADMA is not set # CONFIG_SATA_QSTOR is not set # CONFIG_SATA_PROMISE is not set # CONFIG_SATA_SX4 is not set # CONFIG_SATA_SIL is not set CONFIG_SATA_SIS=m # CONFIG_SATA_ULI is not set # CONFIG_SATA_VIA is not set # CONFIG_SATA_VITESSE is not set # CONFIG_SATA_INIC162X is not set # CONFIG_PATA_ACPI is not set # CONFIG_PATA_ALI is not set # CONFIG_PATA_AMD is not set # CONFIG_PATA_ARTOP is not set CONFIG_PATA_ATIIXP=m # CONFIG_PATA_CMD640_PCI is not set # CONFIG_PATA_CMD64X is not set # CONFIG_PATA_CS5520 is not set # CONFIG_PATA_CS5530 is not set # CONFIG_PATA_CS5535 is not 
set # CONFIG_PATA_CS5536 is not set # CONFIG_PATA_CYPRESS is not set # CONFIG_PATA_EFAR is not set CONFIG_ATA_GENERIC=m # CONFIG_PATA_HPT366 is not set # CONFIG_PATA_HPT37X is not set # CONFIG_PATA_HPT3X2N is not set # CONFIG_PATA_HPT3X3 is not set # CONFIG_PATA_ISAPNP is not set # CONFIG_PATA_IT821X is not set # CONFIG_PATA_IT8213 is not set # CONFIG_PATA_JMICRON is not set # CONFIG_PATA_LEGACY is not set # CONFIG_PATA_TRIFLEX is not set # CONFIG_PATA_MARVELL is not set CONFIG_PATA_MPIIX=m # CONFIG_PATA_OLDPIIX is not set # CONFIG_PATA_NETCELL is not set # CONFIG_PATA_NINJA32 is not set # CONFIG_PATA_NS87410 is not set # CONFIG_PATA_NS87415 is not set # CONFIG_PATA_OPTI is not set # CONFIG_PATA_OPTIDMA is not set CONFIG_PATA_PCMCIA=m # CONFIG_PATA_PDC_OLD is not set # CONFIG_PATA_QDI is not set # CONFIG_PATA_RADISYS is not set # CONFIG_PATA_RZ1000 is not set # CONFIG_PATA_SC1200 is not set # CONFIG_PATA_SERVERWORKS is not set # CONFIG_PATA_PDC2027X is not set # CONFIG_PATA_SIL680 is not set CONFIG_PATA_SIS=m CONFIG_PATA_VIA=m # CONFIG_PATA_WINBOND is not set # CONFIG_PATA_WINBOND_VLB is not set # CONFIG_PATA_SCH is not set # CONFIG_MD is not set CONFIG_FUSION=y CONFIG_FUSION_SPI=m CONFIG_FUSION_FC=m # CONFIG_FUSION_SAS is not set CONFIG_FUSION_MAX_SGE=40 CONFIG_FUSION_CTL=m CONFIG_FUSION_LAN=m CONFIG_FUSION_LOGGING=y # # IEEE 1394 (FireWire) support # # # Enable only one of the two stacks, unless you know what you are doing # CONFIG_FIREWIRE=m CONFIG_FIREWIRE_OHCI=m CONFIG_FIREWIRE_OHCI_DEBUG=y CONFIG_FIREWIRE_SBP2=m # CONFIG_IEEE1394 is not set CONFIG_I2O=m # CONFIG_I2O_LCT_NOTIFY_ON_CHANGES is not set CONFIG_I2O_EXT_ADAPTEC=y CONFIG_I2O_CONFIG=m CONFIG_I2O_CONFIG_OLD_IOCTL=y CONFIG_I2O_BUS=m CONFIG_I2O_BLOCK=m CONFIG_I2O_SCSI=m CONFIG_I2O_PROC=m # CONFIG_MACINTOSH_DRIVERS is not set CONFIG_NETDEVICES=y CONFIG_COMPAT_NET_DEV_OPS=y CONFIG_DUMMY=m CONFIG_BONDING=m # CONFIG_MACVLAN is not set # CONFIG_EQUALIZER is not set CONFIG_TUN=m # CONFIG_VETH is not set # 
CONFIG_NET_SB1000 is not set # CONFIG_ARCNET is not set CONFIG_PHYLIB=m # # MII PHY device drivers # # CONFIG_MARVELL_PHY is not set # CONFIG_DAVICOM_PHY is not set # CONFIG_QSEMI_PHY is not set CONFIG_LXT_PHY=m # CONFIG_CICADA_PHY is not set # CONFIG_VITESSE_PHY is not set # CONFIG_SMSC_PHY is not set # CONFIG_BROADCOM_PHY is not set # CONFIG_ICPLUS_PHY is not set # CONFIG_REALTEK_PHY is not set # CONFIG_NATIONAL_PHY is not set # CONFIG_STE10XP is not set # CONFIG_LSI_ET1011C_PHY is not set # CONFIG_MDIO_BITBANG is not set CONFIG_NET_ETHERNET=y CONFIG_MII=m # CONFIG_HAPPYMEAL is not set # CONFIG_SUNGEM is not set # CONFIG_CASSINI is not set CONFIG_NET_VENDOR_3COM=y # CONFIG_EL1 is not set # CONFIG_EL2 is not set # CONFIG_ELPLUS is not set # CONFIG_EL16 is not set CONFIG_EL3=m # CONFIG_3C515 is not set CONFIG_VORTEX=m CONFIG_TYPHOON=m # CONFIG_LANCE is not set CONFIG_NET_VENDOR_SMC=y # CONFIG_WD80x3 is not set # CONFIG_ULTRA is not set # CONFIG_SMC9194 is not set # CONFIG_ETHOC is not set # CONFIG_NET_VENDOR_RACAL is not set # CONFIG_DNET is not set CONFIG_NET_TULIP=y CONFIG_DE2104X=m CONFIG_TULIP=m # CONFIG_TULIP_MWI is not set CONFIG_TULIP_MMIO=y # CONFIG_TULIP_NAPI is not set CONFIG_DE4X5=m CONFIG_WINBOND_840=m CONFIG_DM9102=m CONFIG_ULI526X=m CONFIG_PCMCIA_XIRCOM=m # CONFIG_AT1700 is not set # CONFIG_DEPCA is not set # CONFIG_HP100 is not set CONFIG_NET_ISA=y # CONFIG_E2100 is not set # CONFIG_EWRK3 is not set # CONFIG_EEXPRESS is not set # CONFIG_EEXPRESS_PRO is not set # CONFIG_HPLAN_PLUS is not set # CONFIG_HPLAN is not set # CONFIG_LP486E is not set # CONFIG_ETH16I is not set CONFIG_NE2000=m # CONFIG_ZNET is not set # CONFIG_SEEQ8005 is not set # CONFIG_IBM_NEW_EMAC_ZMII is not set # CONFIG_IBM_NEW_EMAC_RGMII is not set # CONFIG_IBM_NEW_EMAC_TAH is not set # CONFIG_IBM_NEW_EMAC_EMAC4 is not set # CONFIG_IBM_NEW_EMAC_NO_FLOW_CTRL is not set # CONFIG_IBM_NEW_EMAC_MAL_CLR_ICINTSTAT is not set # CONFIG_IBM_NEW_EMAC_MAL_COMMON_ERR is not set CONFIG_NET_PCI=y 
CONFIG_PCNET32=m CONFIG_AMD8111_ETH=m CONFIG_ADAPTEC_STARFIRE=m # CONFIG_AC3200 is not set # CONFIG_APRICOT is not set CONFIG_B44=m CONFIG_B44_PCI_AUTOSELECT=y CONFIG_B44_PCICORE_AUTOSELECT=y CONFIG_B44_PCI=y CONFIG_FORCEDETH=m CONFIG_FORCEDETH_NAPI=y # CONFIG_CS89x0 is not set CONFIG_E100=m # CONFIG_FEALNX is not set # CONFIG_NATSEMI is not set CONFIG_NE2K_PCI=m # CONFIG_8139CP is not set CONFIG_8139TOO=m # CONFIG_8139TOO_PIO is not set # CONFIG_8139TOO_TUNE_TWISTER is not set CONFIG_8139TOO_8129=y # CONFIG_8139_OLD_RX_RESET is not set # CONFIG_R6040 is not set CONFIG_SIS900=m # CONFIG_EPIC100 is not set # CONFIG_SMSC9420 is not set # CONFIG_SUNDANCE is not set # CONFIG_TLAN is not set CONFIG_VIA_RHINE=m CONFIG_VIA_RHINE_MMIO=y # CONFIG_SC92031 is not set CONFIG_NET_POCKET=y CONFIG_ATP=m CONFIG_DE600=m CONFIG_DE620=m # CONFIG_ATL2 is not set CONFIG_NETDEV_1000=y CONFIG_ACENIC=m # CONFIG_ACENIC_OMIT_TIGON_I is not set # CONFIG_DL2K is not set CONFIG_E1000=m CONFIG_E1000E=m # CONFIG_IP1000 is not set # CONFIG_IGB is not set # CONFIG_IGBVF is not set # CONFIG_NS83820 is not set # CONFIG_HAMACHI is not set # CONFIG_YELLOWFIN is not set CONFIG_R8169=m # CONFIG_SIS190 is not set CONFIG_SKGE=m # CONFIG_SKGE_DEBUG is not set CONFIG_SKY2=m # CONFIG_SKY2_DEBUG is not set CONFIG_VIA_VELOCITY=m # CONFIG_TIGON3 is not set # CONFIG_BNX2 is not set # CONFIG_QLA3XXX is not set # CONFIG_ATL1 is not set # CONFIG_ATL1E is not set # CONFIG_ATL1C is not set # CONFIG_JME is not set # CONFIG_NETDEV_10000 is not set # CONFIG_TR is not set # # Wireless LAN # # CONFIG_WLAN_PRE80211 is not set # CONFIG_WLAN_80211 is not set # # Enable WiMAX (Networking options) to see the WiMAX drivers # # # USB Network Adapters # # CONFIG_USB_CATC is not set # CONFIG_USB_KAWETH is not set # CONFIG_USB_PEGASUS is not set # CONFIG_USB_RTL8150 is not set CONFIG_USB_USBNET=m CONFIG_USB_NET_AX8817X=m CONFIG_USB_NET_CDCETHER=m CONFIG_USB_NET_DM9601=m # CONFIG_USB_NET_SMSC95XX is not set CONFIG_USB_NET_GL620A=m 
CONFIG_USB_NET_NET1080=m # CONFIG_USB_NET_PLUSB is not set # CONFIG_USB_NET_MCS7830 is not set # CONFIG_USB_NET_RNDIS_HOST is not set CONFIG_USB_NET_CDC_SUBSET=m CONFIG_USB_ALI_M5632=y CONFIG_USB_AN2720=y CONFIG_USB_BELKIN=y CONFIG_USB_ARMLINUX=y CONFIG_USB_EPSON2888=y CONFIG_USB_KC2190=y # CONFIG_USB_NET_ZAURUS is not set CONFIG_NET_PCMCIA=y # CONFIG_PCMCIA_3C589 is not set # CONFIG_PCMCIA_3C574 is not set # CONFIG_PCMCIA_FMVJ18X is not set CONFIG_PCMCIA_PCNET=m CONFIG_PCMCIA_NMCLAN=m CONFIG_PCMCIA_SMC91C92=m # CONFIG_PCMCIA_XIRC2PS is not set # CONFIG_PCMCIA_AXNET is not set # CONFIG_WAN is not set CONFIG_FDDI=y # CONFIG_DEFXX is not set # CONFIG_SKFP is not set # CONFIG_HIPPI is not set CONFIG_PLIP=m CONFIG_PPP=m CONFIG_PPP_MULTILINK=y CONFIG_PPP_FILTER=y CONFIG_PPP_ASYNC=m CONFIG_PPP_SYNC_TTY=m CONFIG_PPP_DEFLATE=m # CONFIG_PPP_BSDCOMP is not set # CONFIG_PPP_MPPE is not set CONFIG_PPPOE=m # CONFIG_PPPOL2TP is not set CONFIG_SLIP=m CONFIG_SLIP_COMPRESSED=y CONFIG_SLHC=m CONFIG_SLIP_SMART=y # CONFIG_SLIP_MODE_SLIP6 is not set CONFIG_NET_FC=y CONFIG_NETCONSOLE=m # CONFIG_NETCONSOLE_DYNAMIC is not set CONFIG_NETPOLL=y CONFIG_NETPOLL_TRAP=y CONFIG_NET_POLL_CONTROLLER=y # CONFIG_ISDN is not set # CONFIG_PHONE is not set # # Input device support # CONFIG_INPUT=y CONFIG_INPUT_FF_MEMLESS=y CONFIG_INPUT_POLLDEV=m # # Userland interfaces # CONFIG_INPUT_MOUSEDEV=y # CONFIG_INPUT_MOUSEDEV_PSAUX is not set CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024 CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768 # CONFIG_INPUT_JOYDEV is not set CONFIG_INPUT_EVDEV=y # CONFIG_INPUT_EVBUG is not set # # Input Device Drivers # CONFIG_INPUT_KEYBOARD=y CONFIG_KEYBOARD_ATKBD=y # CONFIG_KEYBOARD_SUNKBD is not set # CONFIG_KEYBOARD_LKKBD is not set # CONFIG_KEYBOARD_XTKBD is not set # CONFIG_KEYBOARD_NEWTON is not set # CONFIG_KEYBOARD_STOWAWAY is not set CONFIG_INPUT_MOUSE=y CONFIG_MOUSE_PS2=y CONFIG_MOUSE_PS2_ALPS=y CONFIG_MOUSE_PS2_LOGIPS2PP=y CONFIG_MOUSE_PS2_SYNAPTICS=y CONFIG_MOUSE_PS2_LIFEBOOK=y 
CONFIG_MOUSE_PS2_TRACKPOINT=y # CONFIG_MOUSE_PS2_ELANTECH is not set # CONFIG_MOUSE_PS2_TOUCHKIT is not set CONFIG_MOUSE_SERIAL=m CONFIG_MOUSE_APPLETOUCH=m # CONFIG_MOUSE_BCM5974 is not set # CONFIG_MOUSE_INPORT is not set # CONFIG_MOUSE_LOGIBM is not set # CONFIG_MOUSE_PC110PAD is not set CONFIG_MOUSE_VSXXXAA=m # CONFIG_INPUT_JOYSTICK is not set # CONFIG_INPUT_TABLET is not set # CONFIG_INPUT_TOUCHSCREEN is not set CONFIG_INPUT_MISC=y # CONFIG_INPUT_PCSPKR is not set # CONFIG_INPUT_APANEL is not set # CONFIG_INPUT_WISTRON_BTNS is not set # CONFIG_INPUT_ATLAS_BTNS is not set # CONFIG_INPUT_ATI_REMOTE is not set # CONFIG_INPUT_ATI_REMOTE2 is not set # CONFIG_INPUT_KEYSPAN_REMOTE is not set # CONFIG_INPUT_POWERMATE is not set # CONFIG_INPUT_YEALINK is not set # CONFIG_INPUT_CM109 is not set CONFIG_INPUT_UINPUT=m # # Hardware I/O ports # CONFIG_SERIO=y CONFIG_SERIO_I8042=y CONFIG_SERIO_SERPORT=y # CONFIG_SERIO_CT82C710 is not set # CONFIG_SERIO_PARKBD is not set # CONFIG_SERIO_PCIPS2 is not set CONFIG_SERIO_LIBPS2=y CONFIG_SERIO_RAW=m # CONFIG_GAMEPORT is not set # # Character devices # CONFIG_VT=y CONFIG_CONSOLE_TRANSLATIONS=y CONFIG_VT_CONSOLE=y CONFIG_HW_CONSOLE=y CONFIG_VT_HW_CONSOLE_BINDING=y CONFIG_DEVKMEM=y CONFIG_SERIAL_NONSTANDARD=y # CONFIG_COMPUTONE is not set CONFIG_ROCKETPORT=m CONFIG_CYCLADES=m # CONFIG_CYZ_INTR is not set # CONFIG_DIGIEPCA is not set # CONFIG_MOXA_INTELLIO is not set # CONFIG_MOXA_SMARTIO is not set # CONFIG_ISI is not set # CONFIG_SYNCLINK is not set CONFIG_SYNCLINKMP=m CONFIG_SYNCLINK_GT=m # CONFIG_N_HDLC is not set # CONFIG_RISCOM8 is not set # CONFIG_SPECIALIX is not set # CONFIG_SX is not set # CONFIG_RIO is not set # CONFIG_STALDRV is not set # CONFIG_NOZOMI is not set # # Serial drivers # CONFIG_SERIAL_8250=y CONFIG_SERIAL_8250_CONSOLE=y CONFIG_FIX_EARLYCON_MEM=y CONFIG_SERIAL_8250_PCI=y CONFIG_SERIAL_8250_PNP=y CONFIG_SERIAL_8250_CS=m CONFIG_SERIAL_8250_NR_UARTS=32 CONFIG_SERIAL_8250_RUNTIME_UARTS=4 CONFIG_SERIAL_8250_EXTENDED=y 
CONFIG_SERIAL_8250_MANY_PORTS=y # CONFIG_SERIAL_8250_FOURPORT is not set # CONFIG_SERIAL_8250_ACCENT is not set # CONFIG_SERIAL_8250_BOCA is not set # CONFIG_SERIAL_8250_EXAR_ST16C554 is not set # CONFIG_SERIAL_8250_HUB6 is not set CONFIG_SERIAL_8250_SHARE_IRQ=y CONFIG_SERIAL_8250_DETECT_IRQ=y CONFIG_SERIAL_8250_RSA=y # # Non-8250 serial port support # CONFIG_SERIAL_CORE=y CONFIG_SERIAL_CORE_CONSOLE=y CONFIG_SERIAL_JSM=m CONFIG_UNIX98_PTYS=y # CONFIG_DEVPTS_MULTIPLE_INSTANCES is not set # CONFIG_LEGACY_PTYS is not set CONFIG_PRINTER=m CONFIG_LP_CONSOLE=y CONFIG_PPDEV=m CONFIG_IPMI_HANDLER=m # CONFIG_IPMI_PANIC_EVENT is not set CONFIG_IPMI_DEVICE_INTERFACE=m CONFIG_IPMI_SI=m CONFIG_IPMI_WATCHDOG=m CONFIG_IPMI_POWEROFF=m CONFIG_HW_RANDOM=y # CONFIG_HW_RANDOM_TIMERIOMEM is not set CONFIG_HW_RANDOM_INTEL=m CONFIG_HW_RANDOM_AMD=m CONFIG_HW_RANDOM_GEODE=m CONFIG_HW_RANDOM_VIA=m CONFIG_NVRAM=y CONFIG_RTC=y # CONFIG_DTLK is not set # CONFIG_R3964 is not set # CONFIG_APPLICOM is not set # CONFIG_SONYPI is not set # # PCMCIA character devices # # CONFIG_SYNCLINK_CS is not set CONFIG_CARDMAN_4000=m CONFIG_CARDMAN_4040=m # CONFIG_IPWIRELESS is not set CONFIG_MWAVE=m # CONFIG_PC8736x_GPIO is not set # CONFIG_NSC_GPIO is not set # CONFIG_CS5535_GPIO is not set # CONFIG_RAW_DRIVER is not set CONFIG_HPET=y # CONFIG_HPET_MMAP is not set CONFIG_HANGCHECK_TIMER=m # CONFIG_TCG_TPM is not set # CONFIG_TELCLOCK is not set CONFIG_DEVPORT=y CONFIG_I2C=m CONFIG_I2C_BOARDINFO=y CONFIG_I2C_CHARDEV=m CONFIG_I2C_HELPER_AUTO=y CONFIG_I2C_ALGOBIT=m CONFIG_I2C_ALGOPCA=m # # I2C Hardware Bus support # # # PC SMBus host controller drivers # CONFIG_I2C_ALI1535=m CONFIG_I2C_ALI1563=m CONFIG_I2C_ALI15X3=m CONFIG_I2C_AMD756=m CONFIG_I2C_AMD756_S4882=m # CONFIG_I2C_AMD8111 is not set CONFIG_I2C_I801=m # CONFIG_I2C_ISCH is not set CONFIG_I2C_PIIX4=m CONFIG_I2C_NFORCE2=m # CONFIG_I2C_NFORCE2_S4985 is not set # CONFIG_I2C_SIS5595 is not set # CONFIG_I2C_SIS630 is not set # CONFIG_I2C_SIS96X is not set 
CONFIG_I2C_VIA=m CONFIG_I2C_VIAPRO=m # # I2C system bus drivers (mostly embedded / system-on-chip) # # CONFIG_I2C_OCORES is not set CONFIG_I2C_SIMTEC=m # # External I2C/SMBus adapter drivers # CONFIG_I2C_PARPORT=m CONFIG_I2C_PARPORT_LIGHT=m # CONFIG_I2C_TAOS_EVM is not set # CONFIG_I2C_TINY_USB is not set # # Graphics adapter I2C/DDC channel drivers # CONFIG_I2C_VOODOO3=m # # Other I2C/SMBus bus drivers # CONFIG_I2C_PCA_ISA=m # CONFIG_I2C_PCA_PLATFORM is not set CONFIG_I2C_STUB=m # CONFIG_SCx200_ACB is not set # # Miscellaneous I2C Chip support # # CONFIG_DS1682 is not set # CONFIG_SENSORS_PCF8574 is not set # CONFIG_PCF8575 is not set # CONFIG_SENSORS_PCA9539 is not set CONFIG_SENSORS_MAX6875=m # CONFIG_SENSORS_TSL2550 is not set # CONFIG_I2C_DEBUG_CORE is not set # CONFIG_I2C_DEBUG_ALGO is not set # CONFIG_I2C_DEBUG_BUS is not set # CONFIG_I2C_DEBUG_CHIP is not set # CONFIG_SPI is not set CONFIG_ARCH_WANT_OPTIONAL_GPIOLIB=y # CONFIG_GPIOLIB is not set # CONFIG_W1 is not set CONFIG_POWER_SUPPLY=y # CONFIG_POWER_SUPPLY_DEBUG is not set # CONFIG_PDA_POWER is not set # CONFIG_BATTERY_DS2760 is not set # CONFIG_BATTERY_BQ27x00 is not set CONFIG_HWMON=m CONFIG_HWMON_VID=m # CONFIG_SENSORS_ABITUGURU is not set # CONFIG_SENSORS_ABITUGURU3 is not set # CONFIG_SENSORS_AD7414 is not set CONFIG_SENSORS_AD7418=m # CONFIG_SENSORS_ADM1021 is not set # CONFIG_SENSORS_ADM1025 is not set # CONFIG_SENSORS_ADM1026 is not set # CONFIG_SENSORS_ADM1029 is not set # CONFIG_SENSORS_ADM1031 is not set # CONFIG_SENSORS_ADM9240 is not set # CONFIG_SENSORS_ADT7462 is not set # CONFIG_SENSORS_ADT7470 is not set # CONFIG_SENSORS_ADT7473 is not set # CONFIG_SENSORS_ADT7475 is not set # CONFIG_SENSORS_K8TEMP is not set # CONFIG_SENSORS_ASB100 is not set # CONFIG_SENSORS_ATK0110 is not set # CONFIG_SENSORS_ATXP1 is not set # CONFIG_SENSORS_DS1621 is not set # CONFIG_SENSORS_I5K_AMB is not set # CONFIG_SENSORS_F71805F is not set # CONFIG_SENSORS_F71882FG is not set # CONFIG_SENSORS_F75375S is not 
set # CONFIG_SENSORS_FSCHER is not set # CONFIG_SENSORS_FSCPOS is not set # CONFIG_SENSORS_FSCHMD is not set # CONFIG_SENSORS_G760A is not set # CONFIG_SENSORS_GL518SM is not set # CONFIG_SENSORS_GL520SM is not set CONFIG_SENSORS_CORETEMP=m # CONFIG_SENSORS_IBMAEM is not set # CONFIG_SENSORS_IBMPEX is not set # CONFIG_SENSORS_IT87 is not set # CONFIG_SENSORS_LM63 is not set # CONFIG_SENSORS_LM75 is not set # CONFIG_SENSORS_LM77 is not set # CONFIG_SENSORS_LM78 is not set # CONFIG_SENSORS_LM80 is not set # CONFIG_SENSORS_LM83 is not set # CONFIG_SENSORS_LM85 is not set # CONFIG_SENSORS_LM87 is not set # CONFIG_SENSORS_LM90 is not set # CONFIG_SENSORS_LM92 is not set # CONFIG_SENSORS_LM93 is not set # CONFIG_SENSORS_LTC4215 is not set # CONFIG_SENSORS_LTC4245 is not set # CONFIG_SENSORS_LM95241 is not set # CONFIG_SENSORS_MAX1619 is not set # CONFIG_SENSORS_MAX6650 is not set # CONFIG_SENSORS_PC87360 is not set # CONFIG_SENSORS_PC87427 is not set # CONFIG_SENSORS_PCF8591 is not set CONFIG_SENSORS_SIS5595=m # CONFIG_SENSORS_DME1737 is not set # CONFIG_SENSORS_SMSC47M1 is not set # CONFIG_SENSORS_SMSC47M192 is not set # CONFIG_SENSORS_SMSC47B397 is not set # CONFIG_SENSORS_ADS7828 is not set # CONFIG_SENSORS_THMC50 is not set CONFIG_SENSORS_VIA686A=m CONFIG_SENSORS_VT1211=m CONFIG_SENSORS_VT8231=m # CONFIG_SENSORS_W83781D is not set # CONFIG_SENSORS_W83791D is not set # CONFIG_SENSORS_W83792D is not set # CONFIG_SENSORS_W83793 is not set # CONFIG_SENSORS_W83L785TS is not set # CONFIG_SENSORS_W83L786NG is not set # CONFIG_SENSORS_W83627HF is not set # CONFIG_SENSORS_W83627EHF is not set CONFIG_SENSORS_HDAPS=m # CONFIG_SENSORS_LIS3LV02D is not set # CONFIG_SENSORS_APPLESMC is not set # CONFIG_HWMON_DEBUG_CHIP is not set CONFIG_THERMAL=y # CONFIG_WATCHDOG is not set CONFIG_SSB_POSSIBLE=y # # Sonics Silicon Backplane # CONFIG_SSB=m CONFIG_SSB_SPROM=y CONFIG_SSB_PCIHOST_POSSIBLE=y CONFIG_SSB_PCIHOST=y # CONFIG_SSB_B43_PCI_BRIDGE is not set CONFIG_SSB_PCMCIAHOST_POSSIBLE=y 
CONFIG_SSB_PCMCIAHOST=y # CONFIG_SSB_DEBUG is not set CONFIG_SSB_DRIVER_PCICORE_POSSIBLE=y CONFIG_SSB_DRIVER_PCICORE=y # # Multifunction device drivers # # CONFIG_MFD_CORE is not set # CONFIG_MFD_SM501 is not set # CONFIG_HTC_PASIC3 is not set # CONFIG_MFD_TMIO is not set # CONFIG_MFD_WM8400 is not set # CONFIG_MFD_WM8350_I2C is not set # CONFIG_MFD_PCF50633 is not set # CONFIG_REGULATOR is not set # # Multimedia devices # # # Multimedia core support # CONFIG_VIDEO_DEV=m CONFIG_VIDEO_V4L2_COMMON=m CONFIG_VIDEO_ALLOW_V4L1=y CONFIG_VIDEO_V4L1_COMPAT=y # CONFIG_DVB_CORE is not set CONFIG_VIDEO_MEDIA=m # # Multimedia drivers # # CONFIG_MEDIA_ATTACH is not set CONFIG_MEDIA_TUNER=m # CONFIG_MEDIA_TUNER_CUSTOMISE is not set CONFIG_MEDIA_TUNER_SIMPLE=m CONFIG_MEDIA_TUNER_TDA8290=m CONFIG_MEDIA_TUNER_TDA9887=m CONFIG_MEDIA_TUNER_TEA5761=m CONFIG_MEDIA_TUNER_TEA5767=m CONFIG_MEDIA_TUNER_MT20XX=m CONFIG_MEDIA_TUNER_XC2028=m CONFIG_MEDIA_TUNER_XC5000=m CONFIG_MEDIA_TUNER_MC44S803=m CONFIG_VIDEO_V4L2=m CONFIG_VIDEO_V4L1=m CONFIG_VIDEOBUF_GEN=m CONFIG_VIDEOBUF_DMA_SG=m CONFIG_VIDEO_BTCX=m CONFIG_VIDEO_IR=m CONFIG_VIDEO_TVEEPROM=m CONFIG_VIDEO_TUNER=m CONFIG_VIDEO_CAPTURE_DRIVERS=y # CONFIG_VIDEO_ADV_DEBUG is not set # CONFIG_VIDEO_FIXED_MINOR_RANGES is not set # CONFIG_VIDEO_HELPER_CHIPS_AUTO is not set CONFIG_VIDEO_IR_I2C=m # # Encoders/decoders and other helper chips # # # Audio decoders # CONFIG_VIDEO_TVAUDIO=m CONFIG_VIDEO_TDA7432=m CONFIG_VIDEO_TDA9840=m CONFIG_VIDEO_TDA9875=m CONFIG_VIDEO_TEA6415C=m CONFIG_VIDEO_TEA6420=m CONFIG_VIDEO_MSP3400=m # CONFIG_VIDEO_CS5345 is not set CONFIG_VIDEO_CS53L32A=m CONFIG_VIDEO_M52790=m CONFIG_VIDEO_TLV320AIC23B=m CONFIG_VIDEO_WM8775=m CONFIG_VIDEO_WM8739=m CONFIG_VIDEO_VP27SMPX=m # # RDS decoders # # CONFIG_VIDEO_SAA6588 is not set # # Video decoders # CONFIG_VIDEO_BT819=m CONFIG_VIDEO_BT856=m CONFIG_VIDEO_BT866=m CONFIG_VIDEO_KS0127=m CONFIG_VIDEO_OV7670=m # CONFIG_VIDEO_TCM825X is not set CONFIG_VIDEO_SAA7110=m CONFIG_VIDEO_SAA711X=m 
CONFIG_VIDEO_SAA717X=m CONFIG_VIDEO_SAA7191=m # CONFIG_VIDEO_TVP514X is not set CONFIG_VIDEO_TVP5150=m CONFIG_VIDEO_VPX3220=m # # Video and audio decoders # CONFIG_VIDEO_CX25840=m # # MPEG video encoders # CONFIG_VIDEO_CX2341X=m # # Video encoders # CONFIG_VIDEO_SAA7127=m CONFIG_VIDEO_SAA7185=m CONFIG_VIDEO_ADV7170=m CONFIG_VIDEO_ADV7175=m # # Video improvement chips # CONFIG_VIDEO_UPD64031A=m CONFIG_VIDEO_UPD64083=m # CONFIG_VIDEO_VIVI is not set CONFIG_VIDEO_BT848=m # CONFIG_VIDEO_PMS is not set # CONFIG_VIDEO_BWQCAM is not set # CONFIG_VIDEO_CQCAM is not set # CONFIG_VIDEO_W9966 is not set CONFIG_VIDEO_CPIA=m CONFIG_VIDEO_CPIA_PP=m CONFIG_VIDEO_CPIA_USB=m CONFIG_VIDEO_CPIA2=m # CONFIG_VIDEO_SAA5246A is not set # CONFIG_VIDEO_SAA5249 is not set # CONFIG_VIDEO_STRADIS is not set CONFIG_VIDEO_ZORAN=m # CONFIG_VIDEO_ZORAN_DC30 is not set CONFIG_VIDEO_ZORAN_ZR36060=m CONFIG_VIDEO_ZORAN_BUZ=m # CONFIG_VIDEO_ZORAN_DC10 is not set CONFIG_VIDEO_ZORAN_LML33=m # CONFIG_VIDEO_ZORAN_LML33R10 is not set # CONFIG_VIDEO_ZORAN_AVS6EYES is not set # CONFIG_VIDEO_SAA7134 is not set # CONFIG_VIDEO_MXB is not set # CONFIG_VIDEO_HEXIUM_ORION is not set # CONFIG_VIDEO_HEXIUM_GEMINI is not set # CONFIG_VIDEO_CX88 is not set CONFIG_VIDEO_IVTV=m # CONFIG_VIDEO_FB_IVTV is not set # CONFIG_VIDEO_CAFE_CCIC is not set # CONFIG_SOC_CAMERA is not set # CONFIG_V4L_USB_DRIVERS is not set CONFIG_RADIO_ADAPTERS=y # CONFIG_RADIO_CADET is not set # CONFIG_RADIO_RTRACK is not set # CONFIG_RADIO_RTRACK2 is not set # CONFIG_RADIO_AZTECH is not set # CONFIG_RADIO_GEMTEK is not set # CONFIG_RADIO_GEMTEK_PCI is not set CONFIG_RADIO_MAXIRADIO=m CONFIG_RADIO_MAESTRO=m # CONFIG_RADIO_SF16FMI is not set # CONFIG_RADIO_SF16FMR2 is not set # CONFIG_RADIO_TERRATEC is not set # CONFIG_RADIO_TRUST is not set # CONFIG_RADIO_TYPHOON is not set # CONFIG_RADIO_ZOLTRIX is not set CONFIG_USB_DSBR=m # CONFIG_USB_SI470X is not set # CONFIG_USB_MR800 is not set # CONFIG_RADIO_TEA5764 is not set CONFIG_DAB=y 
CONFIG_USB_DABUSB=m # # Graphics support # CONFIG_AGP=y CONFIG_AGP_ALI=y CONFIG_AGP_ATI=y # CONFIG_AGP_AMD is not set # CONFIG_AGP_AMD64 is not set CONFIG_AGP_INTEL=y CONFIG_AGP_NVIDIA=y CONFIG_AGP_SIS=y # CONFIG_AGP_SWORKS is not set CONFIG_AGP_VIA=y CONFIG_AGP_EFFICEON=y CONFIG_DRM=m CONFIG_DRM_TDFX=m CONFIG_DRM_R128=m CONFIG_DRM_RADEON=m CONFIG_DRM_I810=m CONFIG_DRM_I830=m CONFIG_DRM_I915=m # CONFIG_DRM_I915_KMS is not set # CONFIG_DRM_MGA is not set CONFIG_DRM_SIS=m # CONFIG_DRM_VIA is not set # CONFIG_DRM_SAVAGE is not set CONFIG_VGASTATE=m CONFIG_VIDEO_OUTPUT_CONTROL=m CONFIG_FB=y # CONFIG_FIRMWARE_EDID is not set CONFIG_FB_DDC=m CONFIG_FB_BOOT_VESA_SUPPORT=y CONFIG_FB_CFB_FILLRECT=y CONFIG_FB_CFB_COPYAREA=y CONFIG_FB_CFB_IMAGEBLIT=y # CONFIG_FB_CFB_REV_PIXELS_IN_BYTE is not set # CONFIG_FB_SYS_FILLRECT is not set # CONFIG_FB_SYS_COPYAREA is not set # CONFIG_FB_SYS_IMAGEBLIT is not set # CONFIG_FB_FOREIGN_ENDIAN is not set # CONFIG_FB_SYS_FOPS is not set CONFIG_FB_SVGALIB=m # CONFIG_FB_MACMODES is not set CONFIG_FB_BACKLIGHT=y CONFIG_FB_MODE_HELPERS=y CONFIG_FB_TILEBLITTING=y # # Frame buffer hardware drivers # # CONFIG_FB_CIRRUS is not set # CONFIG_FB_PM2 is not set # CONFIG_FB_CYBER2000 is not set # CONFIG_FB_ARC is not set # CONFIG_FB_ASILIANT is not set # CONFIG_FB_IMSTT is not set # CONFIG_FB_VGA16 is not set CONFIG_FB_VESA=y # CONFIG_FB_EFI is not set # CONFIG_FB_N411 is not set # CONFIG_FB_HGA is not set # CONFIG_FB_S1D13XXX is not set CONFIG_FB_NVIDIA=m CONFIG_FB_NVIDIA_I2C=y # CONFIG_FB_NVIDIA_DEBUG is not set CONFIG_FB_NVIDIA_BACKLIGHT=y # CONFIG_FB_RIVA is not set # CONFIG_FB_I810 is not set # CONFIG_FB_LE80578 is not set # CONFIG_FB_INTEL is not set # CONFIG_FB_MATROX is not set CONFIG_FB_RADEON=m CONFIG_FB_RADEON_I2C=y CONFIG_FB_RADEON_BACKLIGHT=y # CONFIG_FB_RADEON_DEBUG is not set # CONFIG_FB_ATY128 is not set # CONFIG_FB_ATY is not set CONFIG_FB_S3=m CONFIG_FB_SAVAGE=m CONFIG_FB_SAVAGE_I2C=y CONFIG_FB_SAVAGE_ACCEL=y # CONFIG_FB_SIS is not set 
# CONFIG_FB_VIA is not set # CONFIG_FB_NEOMAGIC is not set # CONFIG_FB_KYRO is not set # CONFIG_FB_3DFX is not set # CONFIG_FB_VOODOO1 is not set # CONFIG_FB_VT8623 is not set CONFIG_FB_TRIDENT=m # CONFIG_FB_ARK is not set # CONFIG_FB_PM3 is not set # CONFIG_FB_CARMINE is not set # CONFIG_FB_GEODE is not set # CONFIG_FB_VIRTUAL is not set # CONFIG_FB_METRONOME is not set # CONFIG_FB_MB862XX is not set # CONFIG_FB_BROADSHEET is not set CONFIG_BACKLIGHT_LCD_SUPPORT=y CONFIG_LCD_CLASS_DEVICE=m # CONFIG_LCD_ILI9320 is not set # CONFIG_LCD_PLATFORM is not set CONFIG_BACKLIGHT_CLASS_DEVICE=y CONFIG_BACKLIGHT_GENERIC=y CONFIG_BACKLIGHT_PROGEAR=m # CONFIG_BACKLIGHT_MBP_NVIDIA is not set # CONFIG_BACKLIGHT_SAHARA is not set # # Display device support # CONFIG_DISPLAY_SUPPORT=m # # Display hardware drivers # # # Console display driver support # CONFIG_VGA_CONSOLE=y CONFIG_VGACON_SOFT_SCROLLBACK=y CONFIG_VGACON_SOFT_SCROLLBACK_SIZE=64 # CONFIG_MDA_CONSOLE is not set CONFIG_DUMMY_CONSOLE=y CONFIG_FRAMEBUFFER_CONSOLE=y CONFIG_FRAMEBUFFER_CONSOLE_DETECT_PRIMARY=y CONFIG_FRAMEBUFFER_CONSOLE_ROTATION=y # CONFIG_FONTS is not set CONFIG_FONT_8x8=y CONFIG_FONT_8x16=y CONFIG_LOGO=y # CONFIG_LOGO_LINUX_MONO is not set # CONFIG_LOGO_LINUX_VGA16 is not set CONFIG_LOGO_LINUX_CLUT224=y # CONFIG_SOUND is not set # CONFIG_HID_SUPPORT is not set CONFIG_USB_SUPPORT=y CONFIG_USB_ARCH_HAS_HCD=y CONFIG_USB_ARCH_HAS_OHCI=y CONFIG_USB_ARCH_HAS_EHCI=y CONFIG_USB=y # CONFIG_USB_DEBUG is not set # CONFIG_USB_ANNOUNCE_NEW_DEVICES is not set # # Miscellaneous USB options # CONFIG_USB_DEVICEFS=y # CONFIG_USB_DEVICE_CLASS is not set # CONFIG_USB_DYNAMIC_MINORS is not set CONFIG_USB_SUSPEND=y # CONFIG_USB_OTG is not set # CONFIG_USB_MON is not set # CONFIG_USB_WUSB is not set # CONFIG_USB_WUSB_CBAF is not set # # USB Host Controller Drivers # # CONFIG_USB_C67X00_HCD is not set CONFIG_USB_EHCI_HCD=m CONFIG_USB_EHCI_ROOT_HUB_TT=y CONFIG_USB_EHCI_TT_NEWSCHED=y # CONFIG_USB_OXU210HP_HCD is not set # 
CONFIG_USB_ISP116X_HCD is not set # CONFIG_USB_ISP1760_HCD is not set CONFIG_USB_OHCI_HCD=m # CONFIG_USB_OHCI_HCD_SSB is not set # CONFIG_USB_OHCI_BIG_ENDIAN_DESC is not set # CONFIG_USB_OHCI_BIG_ENDIAN_MMIO is not set CONFIG_USB_OHCI_LITTLE_ENDIAN=y CONFIG_USB_UHCI_HCD=m # CONFIG_USB_U132_HCD is not set # CONFIG_USB_SL811_HCD is not set # CONFIG_USB_R8A66597_HCD is not set # CONFIG_USB_WHCI_HCD is not set # CONFIG_USB_HWA_HCD is not set # # USB Device Class drivers # # CONFIG_USB_ACM is not set # CONFIG_USB_PRINTER is not set # CONFIG_USB_WDM is not set # CONFIG_USB_TMC is not set # # NOTE: USB_STORAGE depends on SCSI but BLK_DEV_SD may # # # also be needed; see USB_STORAGE Help for more info # CONFIG_USB_STORAGE=m # CONFIG_USB_STORAGE_DEBUG is not set CONFIG_USB_STORAGE_DATAFAB=m CONFIG_USB_STORAGE_FREECOM=m # CONFIG_USB_STORAGE_ISD200 is not set CONFIG_USB_STORAGE_USBAT=m # CONFIG_USB_STORAGE_SDDR09 is not set # CONFIG_USB_STORAGE_SDDR55 is not set # CONFIG_USB_STORAGE_JUMPSHOT is not set # CONFIG_USB_STORAGE_ALAUDA is not set # CONFIG_USB_STORAGE_ONETOUCH is not set # CONFIG_USB_STORAGE_KARMA is not set # CONFIG_USB_STORAGE_CYPRESS_ATACB is not set # CONFIG_USB_LIBUSUAL is not set # # USB Imaging devices # # CONFIG_USB_MDC800 is not set # CONFIG_USB_MICROTEK is not set # # USB port drivers # # CONFIG_USB_USS720 is not set CONFIG_USB_SERIAL=m CONFIG_USB_EZUSB=y CONFIG_USB_SERIAL_GENERIC=y # CONFIG_USB_SERIAL_AIRCABLE is not set # CONFIG_USB_SERIAL_ARK3116 is not set # CONFIG_USB_SERIAL_BELKIN is not set # CONFIG_USB_SERIAL_CH341 is not set # CONFIG_USB_SERIAL_WHITEHEAT is not set # CONFIG_USB_SERIAL_DIGI_ACCELEPORT is not set # CONFIG_USB_SERIAL_CP210X is not set # CONFIG_USB_SERIAL_CYPRESS_M8 is not set CONFIG_USB_SERIAL_EMPEG=m # CONFIG_USB_SERIAL_FTDI_SIO is not set # CONFIG_USB_SERIAL_FUNSOFT is not set # CONFIG_USB_SERIAL_VISOR is not set # CONFIG_USB_SERIAL_IPAQ is not set # CONFIG_USB_SERIAL_IR is not set # CONFIG_USB_SERIAL_EDGEPORT is not set # 
CONFIG_USB_SERIAL_EDGEPORT_TI is not set # CONFIG_USB_SERIAL_GARMIN is not set # CONFIG_USB_SERIAL_IPW is not set # CONFIG_USB_SERIAL_IUU is not set # CONFIG_USB_SERIAL_KEYSPAN_PDA is not set CONFIG_USB_SERIAL_KEYSPAN=m # CONFIG_USB_SERIAL_KEYSPAN_MPR is not set # CONFIG_USB_SERIAL_KEYSPAN_USA28 is not set # CONFIG_USB_SERIAL_KEYSPAN_USA28X is not set # CONFIG_USB_SERIAL_KEYSPAN_USA28XA is not set # CONFIG_USB_SERIAL_KEYSPAN_USA28XB is not set # CONFIG_USB_SERIAL_KEYSPAN_USA19 is not set # CONFIG_USB_SERIAL_KEYSPAN_USA18X is not set # CONFIG_USB_SERIAL_KEYSPAN_USA19W is not set CONFIG_USB_SERIAL_KEYSPAN_USA19QW=y CONFIG_USB_SERIAL_KEYSPAN_USA19QI=y CONFIG_USB_SERIAL_KEYSPAN_USA49W=y CONFIG_USB_SERIAL_KEYSPAN_USA49WLC=y # CONFIG_USB_SERIAL_KLSI is not set # CONFIG_USB_SERIAL_KOBIL_SCT is not set # CONFIG_USB_SERIAL_MCT_U232 is not set # CONFIG_USB_SERIAL_MOS7720 is not set # CONFIG_USB_SERIAL_MOS7840 is not set # CONFIG_USB_SERIAL_MOTOROLA is not set # CONFIG_USB_SERIAL_NAVMAN is not set # CONFIG_USB_SERIAL_PL2303 is not set # CONFIG_USB_SERIAL_OTI6858 is not set # CONFIG_USB_SERIAL_QUALCOMM is not set # CONFIG_USB_SERIAL_SPCP8X5 is not set # CONFIG_USB_SERIAL_HP4X is not set # CONFIG_USB_SERIAL_SAFE is not set # CONFIG_USB_SERIAL_SIEMENS_MPI is not set # CONFIG_USB_SERIAL_SIERRAWIRELESS is not set # CONFIG_USB_SERIAL_SYMBOL is not set # CONFIG_USB_SERIAL_TI is not set # CONFIG_USB_SERIAL_CYBERJACK is not set # CONFIG_USB_SERIAL_XIRCOM is not set # CONFIG_USB_SERIAL_OPTION is not set # CONFIG_USB_SERIAL_OMNINET is not set # CONFIG_USB_SERIAL_OPTICON is not set # CONFIG_USB_SERIAL_DEBUG is not set # # USB Miscellaneous drivers # # CONFIG_USB_EMI62 is not set # CONFIG_USB_EMI26 is not set # CONFIG_USB_ADUTUX is not set # CONFIG_USB_SEVSEG is not set # CONFIG_USB_RIO500 is not set # CONFIG_USB_LEGOTOWER is not set # CONFIG_USB_LCD is not set # CONFIG_USB_BERRY_CHARGE is not set # CONFIG_USB_LED is not set # CONFIG_USB_CYPRESS_CY7C63 is not set # CONFIG_USB_CYTHERM is 
not set # CONFIG_USB_IDMOUSE is not set CONFIG_USB_FTDI_ELAN=m # CONFIG_USB_APPLEDISPLAY is not set # CONFIG_USB_SISUSBVGA is not set # CONFIG_USB_LD is not set # CONFIG_USB_TRANCEVIBRATOR is not set # CONFIG_USB_IOWARRIOR is not set # CONFIG_USB_TEST is not set # CONFIG_USB_ISIGHTFW is not set # CONFIG_USB_VST is not set # CONFIG_USB_GADGET is not set # # OTG and related infrastructure # # CONFIG_NOP_USB_XCEIV is not set # CONFIG_UWB is not set # CONFIG_MMC is not set # CONFIG_MEMSTICK is not set CONFIG_NEW_LEDS=y CONFIG_LEDS_CLASS=y # # LED drivers # # CONFIG_LEDS_ALIX2 is not set # CONFIG_LEDS_PCA9532 is not set # CONFIG_LEDS_LP5521 is not set # CONFIG_LEDS_CLEVO_MAIL is not set # CONFIG_LEDS_PCA955X is not set # CONFIG_LEDS_BD2802 is not set # # LED Triggers # CONFIG_LEDS_TRIGGERS=y CONFIG_LEDS_TRIGGER_TIMER=m # CONFIG_LEDS_TRIGGER_HEARTBEAT is not set # CONFIG_LEDS_TRIGGER_BACKLIGHT is not set # CONFIG_LEDS_TRIGGER_DEFAULT_ON is not set # # iptables trigger is under Netfilter config (LED target) # # CONFIG_ACCESSIBILITY is not set # CONFIG_INFINIBAND is not set # CONFIG_EDAC is not set # CONFIG_RTC_CLASS is not set # CONFIG_DMADEVICES is not set # CONFIG_AUXDISPLAY is not set CONFIG_UIO=m # CONFIG_UIO_CIF is not set # CONFIG_UIO_PDRV is not set # CONFIG_UIO_PDRV_GENIRQ is not set # CONFIG_UIO_SMX is not set # CONFIG_UIO_AEC is not set # CONFIG_UIO_SERCOS3 is not set # CONFIG_STAGING is not set CONFIG_X86_PLATFORM_DEVICES=y # CONFIG_ASUS_LAPTOP is not set # CONFIG_FUJITSU_LAPTOP is not set # CONFIG_TC1100_WMI is not set # CONFIG_MSI_LAPTOP is not set # CONFIG_PANASONIC_LAPTOP is not set # CONFIG_COMPAL_LAPTOP is not set # CONFIG_THINKPAD_ACPI is not set # CONFIG_INTEL_MENLOW is not set # CONFIG_EEEPC_LAPTOP is not set # CONFIG_ACPI_WMI is not set # CONFIG_ACPI_ASUS is not set # CONFIG_ACPI_TOSHIBA is not set # # Firmware Drivers # CONFIG_EDD=m # CONFIG_EDD_OFF is not set CONFIG_FIRMWARE_MEMMAP=y CONFIG_EFI_VARS=y # CONFIG_DELL_RBU is not set # CONFIG_DCDBAS is 
not set CONFIG_DMIID=y # CONFIG_ISCSI_IBFT_FIND is not set # # File systems # CONFIG_EXT2_FS=m # CONFIG_EXT2_FS_XATTR is not set CONFIG_EXT2_FS_XIP=y CONFIG_EXT3_FS=m # CONFIG_EXT3_DEFAULTS_TO_ORDERED is not set CONFIG_EXT3_FS_XATTR=y CONFIG_EXT3_FS_POSIX_ACL=y CONFIG_EXT3_FS_SECURITY=y CONFIG_EXT4_FS=m CONFIG_EXT4DEV_COMPAT=y CONFIG_EXT4_FS_XATTR=y CONFIG_EXT4_FS_POSIX_ACL=y CONFIG_EXT4_FS_SECURITY=y CONFIG_FS_XIP=y CONFIG_JBD=m # CONFIG_JBD_DEBUG is not set CONFIG_JBD2=m # CONFIG_JBD2_DEBUG is not set CONFIG_FS_MBCACHE=m # CONFIG_REISERFS_FS is not set # CONFIG_JFS_FS is not set CONFIG_FS_POSIX_ACL=y CONFIG_FILE_LOCKING=y # CONFIG_XFS_FS is not set # CONFIG_GFS2_FS is not set # CONFIG_OCFS2_FS is not set # CONFIG_BTRFS_FS is not set CONFIG_DNOTIFY=y CONFIG_INOTIFY=y CONFIG_INOTIFY_USER=y # CONFIG_QUOTA is not set # CONFIG_AUTOFS_FS is not set CONFIG_AUTOFS4_FS=m CONFIG_FUSE_FS=m CONFIG_GENERIC_ACL=y # # Caches # # CONFIG_FSCACHE is not set # # CD-ROM/DVD Filesystems # CONFIG_ISO9660_FS=y CONFIG_JOLIET=y CONFIG_ZISOFS=y CONFIG_UDF_FS=y CONFIG_UDF_NLS=y # # DOS/FAT/NT Filesystems # CONFIG_FAT_FS=m CONFIG_MSDOS_FS=m CONFIG_VFAT_FS=m CONFIG_FAT_DEFAULT_CODEPAGE=437 CONFIG_FAT_DEFAULT_IOCHARSET="ascii" # CONFIG_NTFS_FS is not set # # Pseudo filesystems # CONFIG_PROC_FS=y CONFIG_PROC_KCORE=y CONFIG_PROC_VMCORE=y CONFIG_PROC_SYSCTL=y CONFIG_PROC_PAGE_MONITOR=y CONFIG_SYSFS=y CONFIG_TMPFS=y CONFIG_TMPFS_POSIX_ACL=y CONFIG_HUGETLBFS=y CONFIG_HUGETLB_PAGE=y CONFIG_CONFIGFS_FS=m CONFIG_MISC_FILESYSTEMS=y # CONFIG_ADFS_FS is not set # CONFIG_AFFS_FS is not set # CONFIG_HFS_FS is not set # CONFIG_HFSPLUS_FS is not set # CONFIG_BEFS_FS is not set # CONFIG_BFS_FS is not set # CONFIG_EFS_FS is not set CONFIG_CRAMFS=m # CONFIG_SQUASHFS is not set # CONFIG_VXFS_FS is not set # CONFIG_MINIX_FS is not set # CONFIG_OMFS_FS is not set # CONFIG_HPFS_FS is not set # CONFIG_QNX4FS_FS is not set CONFIG_ROMFS_FS=m CONFIG_ROMFS_BACKED_BY_BLOCK=y # CONFIG_ROMFS_BACKED_BY_MTD is not set # 
CONFIG_ROMFS_BACKED_BY_BOTH is not set CONFIG_ROMFS_ON_BLOCK=y # CONFIG_SYSV_FS is not set CONFIG_UFS_FS=m # CONFIG_UFS_FS_WRITE is not set # CONFIG_UFS_DEBUG is not set # CONFIG_NILFS2_FS is not set CONFIG_NETWORK_FILESYSTEMS=y CONFIG_NFS_FS=m CONFIG_NFS_V3=y CONFIG_NFS_V3_ACL=y CONFIG_NFS_V4=y # CONFIG_NFSD is not set CONFIG_LOCKD=m CONFIG_LOCKD_V4=y CONFIG_NFS_ACL_SUPPORT=m CONFIG_NFS_COMMON=y CONFIG_SUNRPC=m CONFIG_SUNRPC_GSS=m CONFIG_RPCSEC_GSS_KRB5=m # CONFIG_RPCSEC_GSS_SPKM3 is not set # CONFIG_SMB_FS is not set # CONFIG_CIFS is not set # CONFIG_NCP_FS is not set # CONFIG_CODA_FS is not set # CONFIG_AFS_FS is not set # # Partition Types # CONFIG_PARTITION_ADVANCED=y # CONFIG_ACORN_PARTITION is not set # CONFIG_OSF_PARTITION is not set # CONFIG_AMIGA_PARTITION is not set # CONFIG_ATARI_PARTITION is not set # CONFIG_MAC_PARTITION is not set CONFIG_MSDOS_PARTITION=y CONFIG_BSD_DISKLABEL=y # CONFIG_MINIX_SUBPARTITION is not set # CONFIG_SOLARIS_X86_PARTITION is not set # CONFIG_UNIXWARE_DISKLABEL is not set # CONFIG_LDM_PARTITION is not set # CONFIG_SGI_PARTITION is not set # CONFIG_ULTRIX_PARTITION is not set # CONFIG_SUN_PARTITION is not set # CONFIG_KARMA_PARTITION is not set CONFIG_EFI_PARTITION=y # CONFIG_SYSV68_PARTITION is not set CONFIG_NLS=y CONFIG_NLS_DEFAULT="utf8" CONFIG_NLS_CODEPAGE_437=y # CONFIG_NLS_CODEPAGE_737 is not set # CONFIG_NLS_CODEPAGE_775 is not set CONFIG_NLS_CODEPAGE_850=m CONFIG_NLS_CODEPAGE_852=m # CONFIG_NLS_CODEPAGE_855 is not set # CONFIG_NLS_CODEPAGE_857 is not set # CONFIG_NLS_CODEPAGE_860 is not set # CONFIG_NLS_CODEPAGE_861 is not set # CONFIG_NLS_CODEPAGE_862 is not set CONFIG_NLS_CODEPAGE_863=m # CONFIG_NLS_CODEPAGE_864 is not set # CONFIG_NLS_CODEPAGE_865 is not set # CONFIG_NLS_CODEPAGE_866 is not set # CONFIG_NLS_CODEPAGE_869 is not set CONFIG_NLS_CODEPAGE_936=m CONFIG_NLS_CODEPAGE_950=m CONFIG_NLS_CODEPAGE_932=m # CONFIG_NLS_CODEPAGE_949 is not set # CONFIG_NLS_CODEPAGE_874 is not set CONFIG_NLS_ISO8859_8=m 
CONFIG_NLS_CODEPAGE_1250=m CONFIG_NLS_CODEPAGE_1251=m CONFIG_NLS_ASCII=y # CONFIG_NLS_ISO8859_1 is not set # CONFIG_NLS_ISO8859_2 is not set # CONFIG_NLS_ISO8859_3 is not set # CONFIG_NLS_ISO8859_4 is not set # CONFIG_NLS_ISO8859_5 is not set # CONFIG_NLS_ISO8859_6 is not set # CONFIG_NLS_ISO8859_7 is not set # CONFIG_NLS_ISO8859_9 is not set # CONFIG_NLS_ISO8859_13 is not set # CONFIG_NLS_ISO8859_14 is not set # CONFIG_NLS_ISO8859_15 is not set # CONFIG_NLS_KOI8_R is not set # CONFIG_NLS_KOI8_U is not set CONFIG_NLS_UTF8=m # CONFIG_DLM is not set # # Kernel hacking # CONFIG_TRACE_IRQFLAGS_SUPPORT=y # CONFIG_PRINTK_TIME is not set # CONFIG_ENABLE_WARN_DEPRECATED is not set # CONFIG_ENABLE_MUST_CHECK is not set CONFIG_FRAME_WARN=1024 CONFIG_MAGIC_SYSRQ=y # CONFIG_UNUSED_SYMBOLS is not set CONFIG_DEBUG_FS=y CONFIG_HEADERS_CHECK=y CONFIG_DEBUG_KERNEL=y CONFIG_DEBUG_SHIRQ=y CONFIG_DETECT_SOFTLOCKUP=y # CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC is not set CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC_VALUE=0 CONFIG_DETECT_HUNG_TASK=y # CONFIG_BOOTPARAM_HUNG_TASK_PANIC is not set CONFIG_BOOTPARAM_HUNG_TASK_PANIC_VALUE=0 CONFIG_SCHED_DEBUG=y CONFIG_SCHEDSTATS=y CONFIG_TIMER_STATS=y # CONFIG_DEBUG_OBJECTS is not set # CONFIG_SLUB_DEBUG_ON is not set # CONFIG_SLUB_STATS is not set CONFIG_DEBUG_PREEMPT=y # CONFIG_DEBUG_RT_MUTEXES is not set # CONFIG_RT_MUTEX_TESTER is not set CONFIG_DEBUG_SPINLOCK=y CONFIG_DEBUG_MUTEXES=y CONFIG_DEBUG_LOCK_ALLOC=y CONFIG_PROVE_LOCKING=y CONFIG_LOCKDEP=y # CONFIG_LOCK_STAT is not set CONFIG_DEBUG_LOCKDEP=y CONFIG_TRACE_IRQFLAGS=y CONFIG_DEBUG_SPINLOCK_SLEEP=y # CONFIG_DEBUG_LOCKING_API_SELFTESTS is not set CONFIG_STACKTRACE=y # CONFIG_DEBUG_KOBJECT is not set CONFIG_DEBUG_HIGHMEM=y CONFIG_DEBUG_BUGVERBOSE=y CONFIG_DEBUG_INFO=y # CONFIG_DEBUG_VM is not set # CONFIG_DEBUG_VIRTUAL is not set # CONFIG_DEBUG_WRITECOUNT is not set CONFIG_DEBUG_MEMORY_INIT=y CONFIG_DEBUG_LIST=y # CONFIG_DEBUG_SG is not set # CONFIG_DEBUG_NOTIFIERS is not set 
CONFIG_ARCH_WANT_FRAME_POINTERS=y
CONFIG_FRAME_POINTER=y
# CONFIG_BOOT_PRINTK_DELAY is not set
# CONFIG_RCU_TORTURE_TEST is not set
# CONFIG_KPROBES_SANITY_TEST is not set
# CONFIG_BACKTRACE_SELF_TEST is not set
# CONFIG_DEBUG_BLOCK_EXT_DEVT is not set
# CONFIG_LKDTM is not set
# CONFIG_FAULT_INJECTION is not set
# CONFIG_LATENCYTOP is not set
CONFIG_SYSCTL_SYSCALL_CHECK=y
# CONFIG_DEBUG_PAGEALLOC is not set
CONFIG_USER_STACKTRACE_SUPPORT=y
CONFIG_NOP_TRACER=y
CONFIG_HAVE_FTRACE_NMI_ENTER=y
CONFIG_HAVE_FUNCTION_TRACER=y
CONFIG_HAVE_FUNCTION_GRAPH_TRACER=y
CONFIG_HAVE_FUNCTION_TRACE_MCOUNT_TEST=y
CONFIG_HAVE_DYNAMIC_FTRACE=y
CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y
CONFIG_HAVE_FTRACE_SYSCALLS=y
CONFIG_TRACER_MAX_TRACE=y
CONFIG_RING_BUFFER=y
CONFIG_FTRACE_NMI_ENTER=y
CONFIG_TRACING=y
CONFIG_TRACING_SUPPORT=y

#
# Tracers
#
CONFIG_FUNCTION_TRACER=y
CONFIG_FUNCTION_GRAPH_TRACER=y
CONFIG_IRQSOFF_TRACER=y
CONFIG_PREEMPT_TRACER=y
CONFIG_SYSPROF_TRACER=y
CONFIG_SCHED_TRACER=y
CONFIG_CONTEXT_SWITCH_TRACER=y
CONFIG_EVENT_TRACER=y
CONFIG_FTRACE_SYSCALLS=y
CONFIG_BOOT_TRACER=y
# CONFIG_TRACE_BRANCH_PROFILING is not set
CONFIG_POWER_TRACER=y
CONFIG_STACK_TRACER=y
# CONFIG_KMEMTRACE is not set
CONFIG_WORKQUEUE_TRACER=y
CONFIG_BLK_DEV_IO_TRACE=y
CONFIG_DYNAMIC_FTRACE=y
CONFIG_FTRACE_MCOUNT_RECORD=y
# CONFIG_FTRACE_STARTUP_TEST is not set
CONFIG_MMIOTRACE=y
CONFIG_MMIOTRACE_TEST=m
# CONFIG_PROVIDE_OHCI1394_DMA_INIT is not set
# CONFIG_FIREWIRE_OHCI_REMOTE_DMA is not set
# CONFIG_BUILD_DOCSRC is not set
# CONFIG_DYNAMIC_DEBUG is not set
# CONFIG_DMA_API_DEBUG is not set
CONFIG_SAMPLES=y
# CONFIG_SAMPLE_MARKERS is not set
# CONFIG_SAMPLE_TRACEPOINTS is not set
CONFIG_SAMPLE_KOBJECT=m
CONFIG_SAMPLE_KPROBES=m
CONFIG_SAMPLE_KRETPROBES=m
CONFIG_HAVE_ARCH_KGDB=y
# CONFIG_KGDB is not set
# CONFIG_STRICT_DEVMEM is not set
CONFIG_X86_VERBOSE_BOOTUP=y
CONFIG_EARLY_PRINTK=y
# CONFIG_EARLY_PRINTK_DBGP is not set
# CONFIG_DEBUG_STACKOVERFLOW is not set
# CONFIG_DEBUG_STACK_USAGE is not set
# CONFIG_DEBUG_PER_CPU_MAPS is not set
# CONFIG_X86_PTDUMP is not set
CONFIG_DEBUG_RODATA=y
# CONFIG_DEBUG_RODATA_TEST is not set
# CONFIG_DEBUG_NX_TEST is not set
CONFIG_4KSTACKS=y
CONFIG_DOUBLEFAULT=y
CONFIG_HAVE_MMIOTRACE_SUPPORT=y
CONFIG_IO_DELAY_TYPE_0X80=0
CONFIG_IO_DELAY_TYPE_0XED=1
CONFIG_IO_DELAY_TYPE_UDELAY=2
CONFIG_IO_DELAY_TYPE_NONE=3
CONFIG_IO_DELAY_0X80=y
# CONFIG_IO_DELAY_0XED is not set
# CONFIG_IO_DELAY_UDELAY is not set
# CONFIG_IO_DELAY_NONE is not set
CONFIG_DEFAULT_IO_DELAY_TYPE=0
# CONFIG_DEBUG_BOOT_PARAMS is not set
# CONFIG_CPA_DEBUG is not set
# CONFIG_OPTIMIZE_INLINING is not set

#
# Security options
#
# CONFIG_KEYS is not set
# CONFIG_SECURITY is not set
# CONFIG_SECURITYFS is not set
# CONFIG_SECURITY_FILE_CAPABILITIES is not set
# CONFIG_IMA is not set
CONFIG_CRYPTO=y

#
# Crypto core or helper
#
# CONFIG_CRYPTO_FIPS is not set
CONFIG_CRYPTO_ALGAPI=y
CONFIG_CRYPTO_ALGAPI2=y
CONFIG_CRYPTO_AEAD2=y
CONFIG_CRYPTO_BLKCIPHER=m
CONFIG_CRYPTO_BLKCIPHER2=y
CONFIG_CRYPTO_HASH=y
CONFIG_CRYPTO_HASH2=y
CONFIG_CRYPTO_RNG2=y
CONFIG_CRYPTO_PCOMP=y
CONFIG_CRYPTO_MANAGER=y
CONFIG_CRYPTO_MANAGER2=y
# CONFIG_CRYPTO_GF128MUL is not set
CONFIG_CRYPTO_NULL=m
CONFIG_CRYPTO_WORKQUEUE=y
# CONFIG_CRYPTO_CRYPTD is not set
# CONFIG_CRYPTO_AUTHENC is not set
# CONFIG_CRYPTO_TEST is not set

#
# Authenticated Encryption with Associated Data
#
# CONFIG_CRYPTO_CCM is not set
# CONFIG_CRYPTO_GCM is not set
# CONFIG_CRYPTO_SEQIV is not set

#
# Block modes
#
CONFIG_CRYPTO_CBC=m
# CONFIG_CRYPTO_CTR is not set
# CONFIG_CRYPTO_CTS is not set
# CONFIG_CRYPTO_ECB is not set
# CONFIG_CRYPTO_LRW is not set
# CONFIG_CRYPTO_PCBC is not set
# CONFIG_CRYPTO_XTS is not set

#
# Hash modes
#
# CONFIG_CRYPTO_HMAC is not set
# CONFIG_CRYPTO_XCBC is not set

#
# Digest
#
CONFIG_CRYPTO_CRC32C=y
# CONFIG_CRYPTO_CRC32C_INTEL is not set
CONFIG_CRYPTO_MD4=m
CONFIG_CRYPTO_MD5=y
# CONFIG_CRYPTO_MICHAEL_MIC is not set
# CONFIG_CRYPTO_RMD128 is not set
# CONFIG_CRYPTO_RMD160 is not set
# CONFIG_CRYPTO_RMD256 is not set
# CONFIG_CRYPTO_RMD320 is not set
CONFIG_CRYPTO_SHA1=y
CONFIG_CRYPTO_SHA256=m
# CONFIG_CRYPTO_SHA512 is not set
# CONFIG_CRYPTO_TGR192 is not set
# CONFIG_CRYPTO_WP512 is not set

#
# Ciphers
#
CONFIG_CRYPTO_AES=m
# CONFIG_CRYPTO_AES_586 is not set
# CONFIG_CRYPTO_ANUBIS is not set
# CONFIG_CRYPTO_ARC4 is not set
# CONFIG_CRYPTO_BLOWFISH is not set
# CONFIG_CRYPTO_CAMELLIA is not set
# CONFIG_CRYPTO_CAST5 is not set
# CONFIG_CRYPTO_CAST6 is not set
CONFIG_CRYPTO_DES=m
# CONFIG_CRYPTO_FCRYPT is not set
# CONFIG_CRYPTO_KHAZAD is not set
# CONFIG_CRYPTO_SALSA20 is not set
# CONFIG_CRYPTO_SALSA20_586 is not set
# CONFIG_CRYPTO_SEED is not set
# CONFIG_CRYPTO_SERPENT is not set
# CONFIG_CRYPTO_TEA is not set
# CONFIG_CRYPTO_TWOFISH is not set
# CONFIG_CRYPTO_TWOFISH_586 is not set

#
# Compression
#
# CONFIG_CRYPTO_DEFLATE is not set
# CONFIG_CRYPTO_ZLIB is not set
# CONFIG_CRYPTO_LZO is not set

#
# Random Number Generation
#
# CONFIG_CRYPTO_ANSI_CPRNG is not set
# CONFIG_CRYPTO_HW is not set
CONFIG_HAVE_KVM=y
CONFIG_HAVE_KVM_IRQCHIP=y
# CONFIG_VIRTUALIZATION is not set
CONFIG_BINARY_PRINTF=y

#
# Library routines
#
CONFIG_BITREVERSE=y
CONFIG_GENERIC_FIND_FIRST_BIT=y
CONFIG_GENERIC_FIND_NEXT_BIT=y
CONFIG_GENERIC_FIND_LAST_BIT=y
CONFIG_CRC_CCITT=m
CONFIG_CRC16=m
# CONFIG_CRC_T10DIF is not set
CONFIG_CRC_ITU_T=y
CONFIG_CRC32=y
# CONFIG_CRC7 is not set
# CONFIG_LIBCRC32C is not set
CONFIG_ZLIB_INFLATE=y
CONFIG_ZLIB_DEFLATE=m
CONFIG_DECOMPRESS_GZIP=y
CONFIG_DECOMPRESS_BZIP2=y
CONFIG_DECOMPRESS_LZMA=y
CONFIG_HAS_IOMEM=y
CONFIG_HAS_IOPORT=y
CONFIG_HAS_DMA=y
CONFIG_NLATTR=y

[-- Attachment #3: dmesg.txt --]
[-- Type: text/plain, Size: 90538 bytes --]

Initializing cgroup subsys cpuset
Initializing cgroup subsys cpu
Linux version 2.6.30-rc4-io (root@localhost.localdomain) (gcc version 4.1.2 20070925 (Red Hat 4.1.2-33)) #6 SMP PREEMPT Thu May 7 11:07:49 CST 2009
KERNEL supported cpus: Intel GenuineIntel AMD AuthenticAMD NSC Geode by NSC Cyrix
CyrixInstead Centaur CentaurHauls Transmeta GenuineTMx86 Transmeta TransmetaCPU UMC UMC UMC UMC BIOS-provided physical RAM map: BIOS-e820: 0000000000000000 - 000000000009f400 (usable) BIOS-e820: 000000000009f400 - 00000000000a0000 (reserved) BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved) BIOS-e820: 0000000000100000 - 000000003bff0000 (usable) BIOS-e820: 000000003bff0000 - 000000003bff3000 (ACPI NVS) BIOS-e820: 000000003bff3000 - 000000003c000000 (ACPI data) BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved) DMI 2.3 present. Phoenix BIOS detected: BIOS may corrupt low RAM, working around it. e820 update range: 0000000000000000 - 0000000000010000 (usable) ==> (reserved) last_pfn = 0x3bff0 max_arch_pfn = 0x100000 MTRR default type: uncachable MTRR fixed ranges enabled: 00000-9FFFF write-back A0000-BFFFF uncachable C0000-C7FFF write-protect C8000-FFFFF uncachable MTRR variable ranges enabled: 0 base 000000000 mask FC0000000 write-back 1 base 03C000000 mask FFC000000 uncachable 2 base 0D0000000 mask FF8000000 write-combining 3 disabled 4 disabled 5 disabled 6 disabled 7 disabled init_memory_mapping: 0000000000000000-00000000377fe000 0000000000 - 0000400000 page 4k 0000400000 - 0037400000 page 2M 0037400000 - 00377fe000 page 4k kernel direct mapping tables up to 377fe000 @ 10000-15000 RAMDISK: 37d0d000 - 37fefd69 Allocated new RAMDISK: 00100000 - 003e2d69 Move RAMDISK from 0000000037d0d000 - 0000000037fefd68 to 00100000 - 003e2d68 ACPI: RSDP 000f7560 00014 (v00 AWARD ) ACPI: RSDT 3bff3040 0002C (v01 AWARD AWRDACPI 42302E31 AWRD 00000000) ACPI: FACP 3bff30c0 00074 (v01 AWARD AWRDACPI 42302E31 AWRD 00000000) ACPI: DSDT 3bff3180 03ABC (v01 AWARD AWRDACPI 00001000 MSFT 0100000E) ACPI: FACS 3bff0000 00040 ACPI: APIC 3bff6c80 00084 (v01 AWARD AWRDACPI 42302E31 AWRD 00000000) ACPI: Local APIC address 0xfee00000 71MB HIGHMEM available. 887MB LOWMEM available. 
mapped low ram: 0 - 377fe000 low ram: 0 - 377fe000 node 0 low ram: 00000000 - 377fe000 node 0 bootmap 00011000 - 00017f00 (9 early reservations) ==> bootmem [0000000000 - 00377fe000] #0 [0000000000 - 0000001000] BIOS data page ==> [0000000000 - 0000001000] #1 [0000001000 - 0000002000] EX TRAMPOLINE ==> [0000001000 - 0000002000] #2 [0000006000 - 0000007000] TRAMPOLINE ==> [0000006000 - 0000007000] #3 [0000400000 - 0000c6bd1c] TEXT DATA BSS ==> [0000400000 - 0000c6bd1c] #4 [000009f400 - 0000100000] BIOS reserved ==> [000009f400 - 0000100000] #5 [0000c6c000 - 0000c700ed] BRK ==> [0000c6c000 - 0000c700ed] #6 [0000010000 - 0000011000] PGTABLE ==> [0000010000 - 0000011000] #7 [0000100000 - 00003e2d69] NEW RAMDISK ==> [0000100000 - 00003e2d69] #8 [0000011000 - 0000018000] BOOTMAP ==> [0000011000 - 0000018000] found SMP MP-table at [c00f5ad0] f5ad0 Zone PFN ranges: DMA 0x00000010 -> 0x00001000 Normal 0x00001000 -> 0x000377fe HighMem 0x000377fe -> 0x0003bff0 Movable zone start PFN for each node early_node_map[2] active PFN ranges 0: 0x00000010 -> 0x0000009f 0: 0x00000100 -> 0x0003bff0 On node 0 totalpages: 245631 free_area_init_node: node 0, pgdat c0778f80, node_mem_map c1000340 DMA zone: 52 pages used for memmap DMA zone: 0 pages reserved DMA zone: 3931 pages, LIFO batch:0 Normal zone: 2834 pages used for memmap Normal zone: 220396 pages, LIFO batch:31 HighMem zone: 234 pages used for memmap HighMem zone: 18184 pages, LIFO batch:3 Using APIC driver default ACPI: PM-Timer IO Port: 0x1008 ACPI: Local APIC address 0xfee00000 ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled) ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled) ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] disabled) ACPI: LAPIC (acpi_id[0x03] lapic_id[0x03] disabled) ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1]) ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1]) ACPI: LAPIC_NMI (acpi_id[0x02] high edge lint[0x1]) ACPI: LAPIC_NMI (acpi_id[0x03] high edge lint[0x1]) ACPI: IOAPIC (id[0x04] address[0xfec00000] 
gsi_base[0]) IOAPIC[0]: apic_id 4, version 17, address 0xfec00000, GSI 0-23 ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl) ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 dfl dfl) ACPI: IRQ0 used by override. ACPI: IRQ2 used by override. ACPI: IRQ9 used by override. Enabling APIC mode: Flat. Using 1 I/O APICs Using ACPI (MADT) for SMP configuration information SMP: Allowing 4 CPUs, 2 hotplug CPUs nr_irqs_gsi: 24 Allocating PCI resources starting at 40000000 (gap: 3c000000:c2c00000) NR_CPUS:8 nr_cpumask_bits:8 nr_cpu_ids:4 nr_node_ids:1 PERCPU: Embedded 13 pages at c1c3b000, static data 32756 bytes Built 1 zonelists in Zone order, mobility grouping on. Total pages: 242511 Kernel command line: ro root=LABEL=/ rhgb quiet Enabling fast FPU save and restore... done. Enabling unmasked SIMD FPU exception support... done. Initializing CPU#0 Preemptible RCU implementation. NR_IRQS:512 CPU 0 irqstacks, hard=c1c3b000 soft=c1c3c000 PID hash table entries: 4096 (order: 12, 16384 bytes) Fast TSC calibration using PIT Detected 2800.222 MHz processor. Console: colour VGA+ 80x25 console [tty0] enabled Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar ... MAX_LOCKDEP_SUBCLASSES: 8 ... MAX_LOCK_DEPTH: 48 ... MAX_LOCKDEP_KEYS: 8191 ... CLASSHASH_SIZE: 4096 ... MAX_LOCKDEP_ENTRIES: 8192 ... MAX_LOCKDEP_CHAINS: 16384 ... 
CHAINHASH_SIZE: 8192 memory used by lock dependency info: 2847 kB per task-struct memory footprint: 1152 bytes Dentry cache hash table entries: 131072 (order: 7, 524288 bytes) Inode-cache hash table entries: 65536 (order: 6, 262144 bytes) allocated 4914560 bytes of page_cgroup please try cgroup_disable=memory,blkio option if you don't want Initializing HighMem for node 0 (000377fe:0003bff0) Memory: 952284k/982976k available (2258k kernel code, 30016k reserved, 1424k data, 320k init, 73672k highmem) virtual kernel memory layout: fixmap : 0xffedf000 - 0xfffff000 (1152 kB) pkmap : 0xff800000 - 0xffc00000 (4096 kB) vmalloc : 0xf7ffe000 - 0xff7fe000 ( 120 MB) lowmem : 0xc0000000 - 0xf77fe000 ( 887 MB) .init : 0xc079d000 - 0xc07ed000 ( 320 kB) .data : 0xc06349ab - 0xc0798cb8 (1424 kB) .text : 0xc0400000 - 0xc06349ab (2258 kB) Checking if this processor honours the WP bit even in supervisor mode...Ok. SLUB: Genslabs=13, HWalign=128, Order=0-3, MinObjects=0, CPUs=4, Nodes=1 Calibrating delay loop (skipped), value calculated using timer frequency.. 5600.44 BogoMIPS (lpj=2800222) Mount-cache hash table entries: 512 Initializing cgroup subsys debug Initializing cgroup subsys ns Initializing cgroup subsys cpuacct Initializing cgroup subsys memory Initializing cgroup subsys blkio Initializing cgroup subsys devices Initializing cgroup subsys freezer Initializing cgroup subsys net_cls Initializing cgroup subsys io CPU: Trace cache: 12K uops, L1 D cache: 16K CPU: L2 cache: 1024K CPU: Physical Processor ID: 0 CPU: Processor Core ID: 0 Intel machine check architecture supported. Intel machine check reporting enabled on CPU#0. CPU0: Intel P4/Xeon Extended MCE MSRs (24) available using mwait in idle threads. Checking 'hlt' instruction... OK. 
ACPI: Core revision 20090320 ftrace: converting mcount calls to 0f 1f 44 00 00 ftrace: allocating 12136 entries in 24 pages ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1 CPU0: Intel(R) Pentium(R) D CPU 2.80GHz stepping 04 lockdep: fixing up alternatives. CPU 1 irqstacks, hard=c1c4b000 soft=c1c4c000 Booting processor 1 APIC 0x1 ip 0x6000 Initializing CPU#1 Calibrating delay using timer specific routine.. 5599.23 BogoMIPS (lpj=2799617) CPU: Trace cache: 12K uops, L1 D cache: 16K CPU: L2 cache: 1024K CPU: Physical Processor ID: 0 CPU: Processor Core ID: 1 Intel machine check architecture supported. Intel machine check reporting enabled on CPU#1. CPU1: Intel P4/Xeon Extended MCE MSRs (24) available CPU1: Intel(R) Pentium(R) D CPU 2.80GHz stepping 04 checking TSC synchronization [CPU#0 -> CPU#1]: passed. Brought up 2 CPUs Total of 2 processors activated (11199.67 BogoMIPS). CPU0 attaching sched-domain: domain 0: span 0-1 level CPU groups: 0 1 CPU1 attaching sched-domain: domain 0: span 0-1 level CPU groups: 1 0 net_namespace: 436 bytes NET: Registered protocol family 16 ACPI: bus type pci registered PCI: PCI BIOS revision 2.10 entry at 0xfbda0, last bus=1 PCI: Using configuration type 1 for base access mtrr: your CPUs had inconsistent fixed MTRR settings mtrr: probably your BIOS does not setup all CPUs. mtrr: corrected configuration. bio: create slab <bio-0> at 0 ACPI: EC: Look up EC in DSDT ACPI: Interpreter enabled ACPI: (supports S0 S3 S5) ACPI: Using IOAPIC for interrupt routing ACPI: No dock devices found. 
ACPI: PCI Root Bridge [PCI0] (0000:00) pci 0000:00:00.0: reg 10 32bit mmio: [0xd0000000-0xd7ffffff] pci 0000:00:02.5: reg 10 io port: [0x1f0-0x1f7] pci 0000:00:02.5: reg 14 io port: [0x3f4-0x3f7] pci 0000:00:02.5: reg 18 io port: [0x170-0x177] pci 0000:00:02.5: reg 1c io port: [0x374-0x377] pci 0000:00:02.5: reg 20 io port: [0x4000-0x400f] pci 0000:00:02.5: PME# supported from D3cold pci 0000:00:02.5: PME# disabled pci 0000:00:02.7: reg 10 io port: [0xd000-0xd0ff] pci 0000:00:02.7: reg 14 io port: [0xd400-0xd47f] pci 0000:00:02.7: supports D1 D2 pci 0000:00:02.7: PME# supported from D3hot D3cold pci 0000:00:02.7: PME# disabled pci 0000:00:03.0: reg 10 32bit mmio: [0xe1104000-0xe1104fff] pci 0000:00:03.1: reg 10 32bit mmio: [0xe1100000-0xe1100fff] pci 0000:00:03.2: reg 10 32bit mmio: [0xe1101000-0xe1101fff] pci 0000:00:03.3: reg 10 32bit mmio: [0xe1102000-0xe1102fff] pci 0000:00:03.3: PME# supported from D0 D3hot D3cold pci 0000:00:03.3: PME# disabled pci 0000:00:05.0: reg 10 io port: [0xd800-0xd807] pci 0000:00:05.0: reg 14 io port: [0xdc00-0xdc03] pci 0000:00:05.0: reg 18 io port: [0xe000-0xe007] pci 0000:00:05.0: reg 1c io port: [0xe400-0xe403] pci 0000:00:05.0: reg 20 io port: [0xe800-0xe80f] pci 0000:00:05.0: PME# supported from D3cold pci 0000:00:05.0: PME# disabled pci 0000:00:0e.0: reg 10 io port: [0xec00-0xecff] pci 0000:00:0e.0: reg 14 32bit mmio: [0xe1103000-0xe11030ff] pci 0000:00:0e.0: reg 30 32bit mmio: [0x000000-0x01ffff] pci 0000:00:0e.0: supports D1 D2 pci 0000:00:0e.0: PME# supported from D1 D2 D3hot D3cold pci 0000:00:0e.0: PME# disabled pci 0000:01:00.0: reg 10 32bit mmio: [0xd8000000-0xdfffffff] pci 0000:01:00.0: reg 14 32bit mmio: [0xe1000000-0xe101ffff] pci 0000:01:00.0: reg 18 io port: [0xc000-0xc07f] pci 0000:01:00.0: supports D1 D2 pci 0000:00:01.0: bridge io port: [0xc000-0xcfff] pci 0000:00:01.0: bridge 32bit mmio: [0xe1000000-0xe10fffff] pci 0000:00:01.0: bridge 32bit mmio pref: [0xd8000000-0xdfffffff] pci_bus 0000:00: on NUMA node 0 
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT] ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 9 10 11 14 15) *0, disabled. ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 9 10 *11 14 15) ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 9 *10 11 14 15) ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 9 10 11 14 15) *0, disabled. ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 6 7 9 10 *11 14 15) ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 *6 7 9 10 11 14 15) ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 6 7 *9 10 11 14 15) ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 *5 6 7 9 10 11 14 15) usbcore: registered new interface driver usbfs usbcore: registered new interface driver hub usbcore: registered new device driver usb PCI: Using ACPI for IRQ routing pnp: PnP ACPI init ACPI: bus type pnp registered pnp: PnP ACPI: found 12 devices ACPI: ACPI bus type pnp unregistered system 00:00: iomem range 0xc8000-0xcbfff has been reserved system 00:00: iomem range 0xf0000-0xf7fff could not be reserved system 00:00: iomem range 0xf8000-0xfbfff could not be reserved system 00:00: iomem range 0xfc000-0xfffff could not be reserved system 00:00: iomem range 0x3bff0000-0x3bffffff could not be reserved system 00:00: iomem range 0xffff0000-0xffffffff has been reserved system 00:00: iomem range 0x0-0x9ffff could not be reserved system 00:00: iomem range 0x100000-0x3bfeffff could not be reserved system 00:00: iomem range 0xffee0000-0xffefffff has been reserved system 00:00: iomem range 0xfffe0000-0xfffeffff has been reserved system 00:00: iomem range 0xfec00000-0xfecfffff has been reserved system 00:00: iomem range 0xfee00000-0xfeefffff has been reserved system 00:02: ioport range 0x4d0-0x4d1 has been reserved system 00:02: ioport range 0x800-0x805 has been reserved system 00:02: ioport range 0x290-0x297 has been reserved system 00:02: ioport range 0x880-0x88f has been reserved pci 0000:00:01.0: PCI bridge, secondary bus 0000:01 pci 0000:00:01.0: IO window: 0xc000-0xcfff pci 0000:00:01.0: MEM 
window: 0xe1000000-0xe10fffff pci 0000:00:01.0: PREFETCH window: 0x000000d8000000-0x000000dfffffff pci_bus 0000:00: resource 0 io: [0x00-0xffff] pci_bus 0000:00: resource 1 mem: [0x000000-0xffffffff] pci_bus 0000:01: resource 0 io: [0xc000-0xcfff] pci_bus 0000:01: resource 1 mem: [0xe1000000-0xe10fffff] pci_bus 0000:01: resource 2 pref mem [0xd8000000-0xdfffffff] NET: Registered protocol family 2 IP route cache hash table entries: 32768 (order: 5, 131072 bytes) TCP established hash table entries: 131072 (order: 8, 1048576 bytes) TCP bind hash table entries: 65536 (order: 9, 2097152 bytes) TCP: Hash tables configured (established 131072 bind 65536) TCP reno registered NET: Registered protocol family 1 checking if image is initramfs... rootfs image is initramfs; unpacking... Freeing initrd memory: 2955k freed apm: BIOS version 1.2 Flags 0x07 (Driver version 1.16ac) apm: disabled - APM is not SMP safe. highmem bounce pool size: 64 pages HugeTLB registered 4 MB page size, pre-allocated 0 pages msgmni has been set to 1722 alg: No test for stdrng (krng) Block layer SCSI generic (bsg) driver version 0.4 loaded (major 254) io scheduler noop registered io scheduler cfq registered (default) pci 0000:01:00.0: Boot video device pci_hotplug: PCI Hot Plug PCI Core version: 0.5 fan PNP0C0B:00: registered as cooling_device0 ACPI: Fan [FAN] (on) processor ACPI_CPU:00: registered as cooling_device1 processor ACPI_CPU:01: registered as cooling_device2 thermal LNXTHERM:01: registered as thermal_zone0 ACPI: Thermal Zone [THRM] (62 C) isapnp: Scanning for PnP cards... 
Switched to high resolution mode on CPU 1 Switched to high resolution mode on CPU 0 isapnp: No Plug & Play device found Real Time Clock Driver v1.12b Non-volatile memory driver v1.3 Linux agpgart interface v0.103 agpgart-sis 0000:00:00.0: SiS chipset [1039/0661] agpgart-sis 0000:00:00.0: AGP aperture is 128M @ 0xd0000000 Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A 00:07: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A 00:08: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A brd: module loaded PNP: PS/2 Controller [PNP0303:PS2K,PNP0f13:PS2M] at 0x60,0x64 irq 1,12 serio: i8042 KBD port at 0x60,0x64 irq 1 serio: i8042 AUX port at 0x60,0x64 irq 12 mice: PS/2 mouse device common for all mice cpuidle: using governor ladder cpuidle: using governor menu TCP cubic registered NET: Registered protocol family 17 Using IPI No-Shortcut mode registered taskstats version 1 Freeing unused kernel memory: 320k freed Write protecting the kernel text: 2260k Write protecting the kernel read-only data: 1120k ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver ehci_hcd 0000:00:03.3: PCI INT D -> GSI 23 (level, low) -> IRQ 23 ehci_hcd 0000:00:03.3: EHCI Host Controller ehci_hcd 0000:00:03.3: new USB bus registered, assigned bus number 1 ehci_hcd 0000:00:03.3: cache line size of 128 is not supported ehci_hcd 0000:00:03.3: irq 23, io mem 0xe1102000 ehci_hcd 0000:00:03.3: USB 2.0 started, EHCI 1.00 usb usb1: configuration #1 chosen from 1 choice hub 1-0:1.0: USB hub found hub 1-0:1.0: 8 ports detected ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver ohci_hcd 0000:00:03.0: PCI INT A -> GSI 20 (level, low) -> IRQ 20 ohci_hcd 0000:00:03.0: OHCI Host Controller ohci_hcd 0000:00:03.0: new USB bus registered, assigned bus number 2 ohci_hcd 0000:00:03.0: irq 20, io mem 0xe1104000 usb usb2: configuration #1 chosen from 1 choice hub 2-0:1.0: USB hub found hub 2-0:1.0: 3 ports detected 
ohci_hcd 0000:00:03.1: PCI INT B -> GSI 21 (level, low) -> IRQ 21 ohci_hcd 0000:00:03.1: OHCI Host Controller ohci_hcd 0000:00:03.1: new USB bus registered, assigned bus number 3 ohci_hcd 0000:00:03.1: irq 21, io mem 0xe1100000 usb usb3: configuration #1 chosen from 1 choice hub 3-0:1.0: USB hub found hub 3-0:1.0: 3 ports detected ohci_hcd 0000:00:03.2: PCI INT C -> GSI 22 (level, low) -> IRQ 22 ohci_hcd 0000:00:03.2: OHCI Host Controller ohci_hcd 0000:00:03.2: new USB bus registered, assigned bus number 4 ohci_hcd 0000:00:03.2: irq 22, io mem 0xe1101000 usb usb4: configuration #1 chosen from 1 choice hub 4-0:1.0: USB hub found hub 4-0:1.0: 2 ports detected uhci_hcd: USB Universal Host Controller Interface driver SCSI subsystem initialized Driver 'sd' needs updating - please use bus_type methods libata version 3.00 loaded. pata_sis 0000:00:02.5: version 0.5.2 pata_sis 0000:00:02.5: PCI INT A -> GSI 16 (level, low) -> IRQ 16 scsi0 : pata_sis scsi1 : pata_sis ata1: PATA max UDMA/133 cmd 0x1f0 ctl 0x3f6 bmdma 0x4000 irq 14 ata2: PATA max UDMA/133 cmd 0x170 ctl 0x376 bmdma 0x4008 irq 15 input: ImPS/2 Logitech Wheel Mouse as /class/input/input0 input: AT Translated Set 2 keyboard as /class/input/input1 sata_sis 0000:00:05.0: version 1.0 sata_sis 0000:00:05.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17 sata_sis 0000:00:05.0: Detected SiS 180/181/964 chipset in SATA mode scsi2 : sata_sis scsi3 : sata_sis ata3: SATA max UDMA/133 cmd 0xd800 ctl 0xdc00 bmdma 0xe800 irq 17 ata4: SATA max UDMA/133 cmd 0xe000 ctl 0xe400 bmdma 0xe808 irq 17 ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300) ata3.00: ATA-7: ST3808110AS, 3.AAE, max UDMA/133 ata3.00: 156301488 sectors, multi 16: LBA48 NCQ (depth 0/32) ata3.00: configured for UDMA/133 scsi 2:0:0:0: Direct-Access ATA ST3808110AS 3.AA PQ: 0 ANSI: 5 sd 2:0:0:0: [sda] 156301488 512-byte hardware sectors: (80.0 GB/74.5 GiB) sd 2:0:0:0: [sda] Write Protect is off sd 2:0:0:0: [sda] Mode Sense: 00 3a 00 00 sd 2:0:0:0: [sda] Write cache: 
enabled, read cache: enabled, doesn't support DPO or FUA sda: sda1 sda2 < sda5 sda6 sda7 sda8 sda9 > sd 2:0:0:0: [sda] Attached SCSI disk ata4: SATA link down (SStatus 0 SControl 300) EXT3-fs: INFO: recovery required on readonly filesystem. EXT3-fs: write access will be enabled during recovery. kjournald starting. Commit interval 5 seconds EXT3-fs: sda8: orphan cleanup on readonly fs ext3_orphan_cleanup: deleting unreferenced inode 3725366 ext3_orphan_cleanup: deleting unreferenced inode 3725365 ext3_orphan_cleanup: deleting unreferenced inode 3725364 EXT3-fs: sda8: 3 orphan inodes deleted EXT3-fs: recovery complete. EXT3-fs: mounted filesystem with writeback data mode. r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded r8169 0000:00:0e.0: PCI INT A -> GSI 18 (level, low) -> IRQ 18 r8169 0000:00:0e.0: no PCI Express capability eth0: RTL8110s at 0xf8236000, 00:16:ec:2e:b7:e0, XID 04000000 IRQ 18 sd 2:0:0:0: Attached scsi generic sg0 type 0 parport_pc 00:09: reported by Plug and Play ACPI parport0: PC-style at 0x378 (0x778), irq 7 [PCSPP,TRISTATE] input: Power Button as /class/input/input2 ACPI: Power Button [PWRF] input: Power Button as /class/input/input3 ACPI: Power Button [PWRB] input: Sleep Button as /class/input/input4 ACPI: Sleep Button [FUTS] ramfs: bad mount option: maxsize=512 EXT3 FS on sda8, internal journal kjournald starting. Commit interval 5 seconds EXT3 FS on sda7, internal journal EXT3-fs: mounted filesystem with writeback data mode. Adding 1052216k swap on /dev/sda6. Priority:-1 extents:1 across:1052216k warning: process `kudzu' used the deprecated sysctl system call with 1.23. kudzu[1133] general protection ip:8056968 sp:bffe9e90 error:0 r8169: eth0: link up r8169: eth0: link up warning: `dbus-daemon' uses 32-bit capabilities (legacy support in use) CPU0 attaching NULL sched-domain. CPU1 attaching NULL sched-domain. 
CPU0 attaching sched-domain: domain 0: span 0-1 level CPU groups: 0 1 CPU1 attaching sched-domain: domain 0: span 0-1 level CPU groups: 1 0 ========================================================= [ INFO: possible irq lock inversion dependency detected ] 2.6.30-rc4-io #6 --------------------------------------------------------- rmdir/2186 just changed the state of lock: (&iocg->lock){+.+...}, at: [<c0513b18>] iocg_destroy+0x2a/0x118 but this lock was taken by another, SOFTIRQ-safe lock in the past: (&q->__queue_lock){..-...} and interrupts could create inverse lock ordering between them. other info that might help us debug this: 3 locks held by rmdir/2186: #0: (&sb->s_type->i_mutex_key#10/1){+.+.+.}, at: [<c04ae1e8>] do_rmdir+0x5c/0xc8 #1: (cgroup_mutex){+.+.+.}, at: [<c045a15b>] cgroup_diput+0x3c/0xa7 #2: (&iocg->lock){+.+...}, at: [<c0513b18>] iocg_destroy+0x2a/0x118 the first lock's dependencies: -> (&iocg->lock){+.+...} ops: 3 { HARDIRQ-ON-W at: [<c044b840>] mark_held_locks+0x3d/0x58 [<c044b963>] trace_hardirqs_on_caller+0x108/0x14c [<c044b9b2>] trace_hardirqs_on+0xb/0xd [<c0630883>] _spin_unlock_irq+0x27/0x47 [<c0513baa>] iocg_destroy+0xbc/0x118 [<c045a16a>] cgroup_diput+0x4b/0xa7 [<c04b1dbb>] dentry_iput+0x78/0x9c [<c04b1e82>] d_kill+0x21/0x3b [<c04b2f2a>] dput+0xf3/0xfc [<c04ae226>] do_rmdir+0x9a/0xc8 [<c04ae29d>] sys_rmdir+0x15/0x17 [<c0402a68>] sysenter_do_call+0x12/0x36 [<ffffffff>] 0xffffffff SOFTIRQ-ON-W at: [<c044b840>] mark_held_locks+0x3d/0x58 [<c044b97c>] trace_hardirqs_on_caller+0x121/0x14c [<c044b9b2>] trace_hardirqs_on+0xb/0xd [<c0630883>] _spin_unlock_irq+0x27/0x47 [<c0513baa>] iocg_destroy+0xbc/0x118 [<c045a16a>] cgroup_diput+0x4b/0xa7 [<c04b1dbb>] dentry_iput+0x78/0x9c [<c04b1e82>] d_kill+0x21/0x3b [<c04b2f2a>] dput+0xf3/0xfc [<c04ae226>] do_rmdir+0x9a/0xc8 [<c04ae29d>] sys_rmdir+0x15/0x17 [<c0402a68>] sysenter_do_call+0x12/0x36 [<ffffffff>] 0xffffffff INITIAL USE at: [<c044dad5>] __lock_acquire+0x58c/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc 
[<c06304ea>] _spin_lock_irq+0x30/0x3f [<c05119bd>] io_alloc_root_group+0x104/0x155 [<c05133cb>] elv_init_fq_data+0x32/0xe0 [<c0504317>] elevator_alloc+0x150/0x170 [<c0505393>] elevator_init+0x9d/0x100 [<c0507088>] blk_init_queue_node+0xc4/0xf7 [<c05070cb>] blk_init_queue+0x10/0x12 [<f81060fd>] __scsi_alloc_queue+0x1c/0xba [scsi_mod] [<f81061b0>] scsi_alloc_queue+0x15/0x4e [scsi_mod] [<f810803d>] scsi_alloc_sdev+0x154/0x1f5 [scsi_mod] [<f8108387>] scsi_probe_and_add_lun+0x123/0xb5b [scsi_mod] [<f8109847>] __scsi_add_device+0x8a/0xb0 [scsi_mod] [<f816ad14>] ata_scsi_scan_host+0x77/0x141 [libata] [<f816903f>] async_port_probe+0xa0/0xa9 [libata] [<c044341f>] async_thread+0xe9/0x1c9 [<c043e204>] kthread+0x4a/0x72 [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff } ... key at: [<c0c5ebd8>] __key.29462+0x0/0x8 the second lock's dependencies: -> (&q->__queue_lock){..-...} ops: 162810 { IN-SOFTIRQ-W at: [<c044da08>] __lock_acquire+0x4bf/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 [<f810672c>] scsi_device_unbusy+0x78/0x92 [scsi_mod] [<f8101483>] scsi_finish_command+0x22/0xd4 [scsi_mod] [<f8106fdb>] scsi_softirq_done+0xf9/0x101 [scsi_mod] [<c050a936>] blk_done_softirq+0x5e/0x70 [<c0431379>] __do_softirq+0xb8/0x180 [<ffffffff>] 0xffffffff INITIAL USE at: [<c044dad5>] __lock_acquire+0x58c/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<f8101337>] scsi_adjust_queue_depth+0x2a/0xc9 [scsi_mod] [<f8108079>] scsi_alloc_sdev+0x190/0x1f5 [scsi_mod] [<f8108387>] scsi_probe_and_add_lun+0x123/0xb5b [scsi_mod] [<f8109847>] __scsi_add_device+0x8a/0xb0 [scsi_mod] [<f816ad14>] ata_scsi_scan_host+0x77/0x141 [libata] [<f816903f>] async_port_probe+0xa0/0xa9 [libata] [<c044341f>] async_thread+0xe9/0x1c9 [<c043e204>] kthread+0x4a/0x72 [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff } ... 
key at: [<c0c5e698>] __key.29749+0x0/0x8 -> (&ioc->lock){..-...} ops: 1032 { IN-SOFTIRQ-W at: [<c044da08>] __lock_acquire+0x4bf/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c050f0f0>] cic_free_func+0x26/0x64 [<c050ea90>] __call_for_each_cic+0x23/0x2e [<c050eaad>] cfq_free_io_context+0x12/0x14 [<c050978c>] put_io_context+0x4b/0x66 [<c050f2a2>] cfq_put_request+0x42/0x5b [<c0504629>] elv_put_request+0x30/0x33 [<c050678d>] __blk_put_request+0x8b/0xb8 [<c0506953>] end_that_request_last+0x199/0x1a1 [<c0506a0d>] blk_end_io+0x51/0x6f [<c0506a64>] blk_end_request+0x11/0x13 [<f8106c9c>] scsi_io_completion+0x1d9/0x41f [scsi_mod] [<f810152d>] scsi_finish_command+0xcc/0xd4 [scsi_mod] [<f8106fdb>] scsi_softirq_done+0xf9/0x101 [scsi_mod] [<c050a936>] blk_done_softirq+0x5e/0x70 [<c0431379>] __do_softirq+0xb8/0x180 [<ffffffff>] 0xffffffff INITIAL USE at: [<c044dad5>] __lock_acquire+0x58c/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c050f9bf>] cfq_set_request+0x123/0x33d [<c05052e6>] elv_set_request+0x43/0x53 [<c0506d44>] get_request+0x22e/0x33f [<c0507498>] get_request_wait+0x137/0x15d [<c0507501>] blk_get_request+0x43/0x73 [<f8106854>] scsi_execute+0x24/0x11c [scsi_mod] [<f81069ff>] scsi_execute_req+0xb3/0x104 [scsi_mod] [<f81084f8>] scsi_probe_and_add_lun+0x294/0xb5b [scsi_mod] [<f8109847>] __scsi_add_device+0x8a/0xb0 [scsi_mod] [<f816ad14>] ata_scsi_scan_host+0x77/0x141 [libata] [<f816903f>] async_port_probe+0xa0/0xa9 [libata] [<c044341f>] async_thread+0xe9/0x1c9 [<c043e204>] kthread+0x4a/0x72 [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff } ... 
key at: [<c0c5e6ec>] __key.27747+0x0/0x8 -> (&rdp->lock){-.-...} ops: 168014 { IN-HARDIRQ-W at: [<c044d9e4>] __lock_acquire+0x49b/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c0461b2a>] rcu_check_callbacks+0x6a/0xa3 [<c043549a>] update_process_times+0x3d/0x53 [<c0447fe0>] tick_periodic+0x6b/0x77 [<c0448009>] tick_handle_periodic+0x1d/0x60 [<c063406e>] smp_apic_timer_interrupt+0x6e/0x81 [<c04033c7>] apic_timer_interrupt+0x2f/0x34 [<c042fbd7>] do_exit+0x53e/0x5b3 [<c043a9d8>] __request_module+0x0/0x100 [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff IN-SOFTIRQ-W at: [<c044da08>] __lock_acquire+0x4bf/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 [<c04619db>] rcu_process_callbacks+0x2b/0x86 [<c0431379>] __do_softirq+0xb8/0x180 [<ffffffff>] 0xffffffff INITIAL USE at: [<c044dad5>] __lock_acquire+0x58c/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c062c8ca>] rcu_online_cpu+0x3d/0x51 [<c062c910>] rcu_cpu_notify+0x32/0x43 [<c07b097f>] __rcu_init+0xf0/0x120 [<c07af027>] rcu_init+0x8/0x14 [<c079d6e1>] start_kernel+0x187/0x2fc [<c079d06a>] __init_begin+0x6a/0x6f [<ffffffff>] 0xffffffff } ... key at: [<c0c2e52c>] __key.17543+0x0/0x8 ... acquired at: [<c044d243>] validate_chain+0x8a8/0xbae [<c044dbfd>] __lock_acquire+0x6b4/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 [<c046143d>] call_rcu+0x36/0x5b [<c0517b45>] radix_tree_delete+0xe7/0x176 [<c050f0fe>] cic_free_func+0x34/0x64 [<c050ea90>] __call_for_each_cic+0x23/0x2e [<c050eaad>] cfq_free_io_context+0x12/0x14 [<c050978c>] put_io_context+0x4b/0x66 [<c050984c>] exit_io_context+0x77/0x7b [<c042fc24>] do_exit+0x58b/0x5b3 [<c04034ed>] kernel_thread_helper+0xd/0x10 [<ffffffff>] 0xffffffff ... 
acquired at: [<c044d243>] validate_chain+0x8a8/0xbae [<c044dbfd>] __lock_acquire+0x6b4/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c050f4a3>] cfq_cic_lookup+0xd9/0xef [<c050f674>] cfq_get_queue+0x92/0x2ba [<c050fb01>] cfq_set_request+0x265/0x33d [<c05052e6>] elv_set_request+0x43/0x53 [<c0506d44>] get_request+0x22e/0x33f [<c0507498>] get_request_wait+0x137/0x15d [<c0507501>] blk_get_request+0x43/0x73 [<f8106854>] scsi_execute+0x24/0x11c [scsi_mod] [<f81069ff>] scsi_execute_req+0xb3/0x104 [scsi_mod] [<f81084f8>] scsi_probe_and_add_lun+0x294/0xb5b [scsi_mod] [<f8109847>] __scsi_add_device+0x8a/0xb0 [scsi_mod] [<f816ad14>] ata_scsi_scan_host+0x77/0x141 [libata] [<f816903f>] async_port_probe+0xa0/0xa9 [libata] [<c044341f>] async_thread+0xe9/0x1c9 [<c043e204>] kthread+0x4a/0x72 [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff -> (&base->lock){..-...} ops: 348073 { IN-SOFTIRQ-W at: [<c044da08>] __lock_acquire+0x4bf/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c06304ea>] _spin_lock_irq+0x30/0x3f [<c0434b8b>] run_timer_softirq+0x3c/0x1d1 [<c0431379>] __do_softirq+0xb8/0x180 [<ffffffff>] 0xffffffff INITIAL USE at: [<c044dad5>] __lock_acquire+0x58c/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c0434e84>] lock_timer_base+0x24/0x43 [<c0434f3d>] mod_timer+0x46/0xcc [<c07bd97a>] con_init+0xa4/0x20e [<c07bd3b2>] console_init+0x12/0x20 [<c079d735>] start_kernel+0x1db/0x2fc [<c079d06a>] __init_begin+0x6a/0x6f [<ffffffff>] 0xffffffff } ... key at: [<c082304c>] __key.23401+0x0/0x8 ... 
acquired at: [<c044d243>] validate_chain+0x8a8/0xbae [<c044dbfd>] __lock_acquire+0x6b4/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c0434e84>] lock_timer_base+0x24/0x43 [<c0434f3d>] mod_timer+0x46/0xcc [<c05075cb>] blk_plug_device+0x9a/0xdf [<c05049e1>] __elv_add_request+0x86/0x96 [<c0509d52>] blk_execute_rq_nowait+0x5d/0x86 [<c0509e2e>] blk_execute_rq+0xb3/0xd5 [<f81068f5>] scsi_execute+0xc5/0x11c [scsi_mod] [<f81069ff>] scsi_execute_req+0xb3/0x104 [scsi_mod] [<f81084f8>] scsi_probe_and_add_lun+0x294/0xb5b [scsi_mod] [<f8109847>] __scsi_add_device+0x8a/0xb0 [scsi_mod] [<f816ad14>] ata_scsi_scan_host+0x77/0x141 [libata] [<f816903f>] async_port_probe+0xa0/0xa9 [libata] [<c044341f>] async_thread+0xe9/0x1c9 [<c043e204>] kthread+0x4a/0x72 [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff -> (&sdev->list_lock){..-...} ops: 27612 { IN-SOFTIRQ-W at: [<c044da08>] __lock_acquire+0x4bf/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<f8101cb4>] scsi_put_command+0x17/0x57 [scsi_mod] [<f810620f>] scsi_next_command+0x26/0x39 [scsi_mod] [<f8106d02>] scsi_io_completion+0x23f/0x41f [scsi_mod] [<f810152d>] scsi_finish_command+0xcc/0xd4 [scsi_mod] [<f8106fdb>] scsi_softirq_done+0xf9/0x101 [scsi_mod] [<c050a936>] blk_done_softirq+0x5e/0x70 [<c0431379>] __do_softirq+0xb8/0x180 [<ffffffff>] 0xffffffff INITIAL USE at: [<c044dad5>] __lock_acquire+0x58c/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<f8101c64>] scsi_get_command+0x5c/0x95 [scsi_mod] [<f81062b6>] scsi_get_cmd_from_req+0x26/0x50 [scsi_mod] [<f8106594>] scsi_setup_blk_pc_cmnd+0x2b/0xd7 [scsi_mod] [<f8106664>] scsi_prep_fn+0x24/0x33 [scsi_mod] [<c0504712>] elv_next_request+0xe6/0x18d [<f810704c>] scsi_request_fn+0x69/0x431 [scsi_mod] [<c05072af>] __generic_unplug_device+0x2e/0x31 [<c0509d59>] blk_execute_rq_nowait+0x64/0x86 [<c0509e2e>] blk_execute_rq+0xb3/0xd5 [<f81068f5>] 
scsi_execute+0xc5/0x11c [scsi_mod] [<f81069ff>] scsi_execute_req+0xb3/0x104 [scsi_mod] [<f81084f8>] scsi_probe_and_add_lun+0x294/0xb5b [scsi_mod] [<f8109847>] __scsi_add_device+0x8a/0xb0 [scsi_mod] [<f816ad14>] ata_scsi_scan_host+0x77/0x141 [libata] [<f816903f>] async_port_probe+0xa0/0xa9 [libata] [<c044341f>] async_thread+0xe9/0x1c9 [<c043e204>] kthread+0x4a/0x72 [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff } ... key at: [<f811916c>] __key.29786+0x0/0xffff2ebf [scsi_mod] ... acquired at: [<c044d243>] validate_chain+0x8a8/0xbae [<c044dbfd>] __lock_acquire+0x6b4/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<f8101c64>] scsi_get_command+0x5c/0x95 [scsi_mod] [<f81062b6>] scsi_get_cmd_from_req+0x26/0x50 [scsi_mod] [<f8106594>] scsi_setup_blk_pc_cmnd+0x2b/0xd7 [scsi_mod] [<f8106664>] scsi_prep_fn+0x24/0x33 [scsi_mod] [<c0504712>] elv_next_request+0xe6/0x18d [<f810704c>] scsi_request_fn+0x69/0x431 [scsi_mod] [<c05072af>] __generic_unplug_device+0x2e/0x31 [<c0509d59>] blk_execute_rq_nowait+0x64/0x86 [<c0509e2e>] blk_execute_rq+0xb3/0xd5 [<f81068f5>] scsi_execute+0xc5/0x11c [scsi_mod] [<f81069ff>] scsi_execute_req+0xb3/0x104 [scsi_mod] [<f81084f8>] scsi_probe_and_add_lun+0x294/0xb5b [scsi_mod] [<f8109847>] __scsi_add_device+0x8a/0xb0 [scsi_mod] [<f816ad14>] ata_scsi_scan_host+0x77/0x141 [libata] [<f816903f>] async_port_probe+0xa0/0xa9 [libata] [<c044341f>] async_thread+0xe9/0x1c9 [<c043e204>] kthread+0x4a/0x72 [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff -> (&q->lock){-.-.-.} ops: 2105038 { IN-HARDIRQ-W at: [<c044d9e4>] __lock_acquire+0x49b/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c041ec0d>] complete+0x17/0x43 [<c062609b>] i8042_aux_test_irq+0x4c/0x65 [<c045e922>] handle_IRQ_event+0xa4/0x169 [<c04602ea>] handle_edge_irq+0xc9/0x10a [<ffffffff>] 0xffffffff IN-SOFTIRQ-W at: [<c044da08>] __lock_acquire+0x4bf/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc 
[<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c041ec0d>] complete+0x17/0x43 [<c043c336>] wakeme_after_rcu+0x10/0x12 [<c0461a12>] rcu_process_callbacks+0x62/0x86 [<c0431379>] __do_softirq+0xb8/0x180 [<ffffffff>] 0xffffffff IN-RECLAIM_FS-W at: [<c044dabd>] __lock_acquire+0x574/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c043e47b>] prepare_to_wait+0x1c/0x4a [<c0485d3e>] kswapd+0xa7/0x51b [<c043e204>] kthread+0x4a/0x72 [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff INITIAL USE at: [<c044dad5>] __lock_acquire+0x58c/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c06304ea>] _spin_lock_irq+0x30/0x3f [<c062d811>] wait_for_common+0x2f/0xeb [<c062d968>] wait_for_completion+0x17/0x19 [<c043e161>] kthread_create+0x6e/0xc7 [<c062b7eb>] migration_call+0x39/0x444 [<c07ae112>] migration_init+0x1d/0x4b [<c040115c>] do_one_initcall+0x6a/0x16e [<c079d44d>] kernel_init+0x4d/0x15a [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff } ... 
key at: [<c0823490>] __key.17681+0x0/0x8 -> (&rq->lock){-.-.-.} ops: 854341 { IN-HARDIRQ-W at: [<c044d9e4>] __lock_acquire+0x49b/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 [<c0429f89>] scheduler_tick+0x39/0x19b [<c04354a4>] update_process_times+0x47/0x53 [<c0447fe0>] tick_periodic+0x6b/0x77 [<c0448009>] tick_handle_periodic+0x1d/0x60 [<c0404ace>] timer_interrupt+0x3e/0x45 [<c045e922>] handle_IRQ_event+0xa4/0x169 [<c04603a3>] handle_level_irq+0x78/0xc1 [<ffffffff>] 0xffffffff IN-SOFTIRQ-W at: [<c044da08>] __lock_acquire+0x4bf/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 [<c041ede7>] task_rq_lock+0x3b/0x62 [<c0426e41>] try_to_wake_up+0x75/0x2d4 [<c04270d7>] wake_up_process+0x14/0x16 [<c043507c>] process_timeout+0xd/0xf [<c0434caa>] run_timer_softirq+0x15b/0x1d1 [<c0431379>] __do_softirq+0xb8/0x180 [<ffffffff>] 0xffffffff IN-RECLAIM_FS-W at: [<c044dabd>] __lock_acquire+0x574/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 [<c041ede7>] task_rq_lock+0x3b/0x62 [<c0427515>] set_cpus_allowed_ptr+0x1a/0xdd [<c0485cf8>] kswapd+0x61/0x51b [<c043e204>] kthread+0x4a/0x72 [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff INITIAL USE at: [<c044dad5>] __lock_acquire+0x58c/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c042398e>] rq_attach_root+0x17/0xa7 [<c07ae52c>] sched_init+0x240/0x33e [<c079d661>] start_kernel+0x107/0x2fc [<c079d06a>] __init_begin+0x6a/0x6f [<ffffffff>] 0xffffffff } ... 
key at: [<c0800518>] __key.46938+0x0/0x8 -> (&vec->lock){-.-...} ops: 34058 { IN-HARDIRQ-W at: [<c044d9e4>] __lock_acquire+0x49b/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c047ad3b>] cpupri_set+0x51/0xba [<c04219ee>] __enqueue_rt_entity+0xe2/0x1c8 [<c0421e18>] enqueue_rt_entity+0x19/0x23 [<c0428a52>] enqueue_task_rt+0x24/0x51 [<c041e03b>] enqueue_task+0x64/0x70 [<c041e06b>] activate_task+0x24/0x2a [<c0426f9e>] try_to_wake_up+0x1d2/0x2d4 [<c04270d7>] wake_up_process+0x14/0x16 [<c04408b6>] hrtimer_wakeup+0x1d/0x21 [<c0440922>] __run_hrtimer+0x68/0x98 [<c04411ca>] hrtimer_interrupt+0x101/0x153 [<c063406e>] smp_apic_timer_interrupt+0x6e/0x81 [<c04033c7>] apic_timer_interrupt+0x2f/0x34 [<c0401c4f>] cpu_idle+0x53/0x85 [<c061fc80>] rest_init+0x6c/0x6e [<c079d851>] start_kernel+0x2f7/0x2fc [<c079d06a>] __init_begin+0x6a/0x6f [<ffffffff>] 0xffffffff IN-SOFTIRQ-W at: [<c044da08>] __lock_acquire+0x4bf/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c047ad3b>] cpupri_set+0x51/0xba [<c04219ee>] __enqueue_rt_entity+0xe2/0x1c8 [<c0421e18>] enqueue_rt_entity+0x19/0x23 [<c0428a52>] enqueue_task_rt+0x24/0x51 [<c041e03b>] enqueue_task+0x64/0x70 [<c041e06b>] activate_task+0x24/0x2a [<c0426f9e>] try_to_wake_up+0x1d2/0x2d4 [<c04270d7>] wake_up_process+0x14/0x16 [<c042737c>] rebalance_domains+0x2a3/0x3ac [<c0429a06>] run_rebalance_domains+0x32/0xaa [<c0431379>] __do_softirq+0xb8/0x180 [<ffffffff>] 0xffffffff INITIAL USE at: [<c044dad5>] __lock_acquire+0x58c/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c047ad74>] cpupri_set+0x8a/0xba [<c04216f2>] rq_online_rt+0x5e/0x61 [<c041dd3a>] set_rq_online+0x40/0x4a [<c04239fb>] rq_attach_root+0x84/0xa7 [<c07ae52c>] sched_init+0x240/0x33e [<c079d661>] start_kernel+0x107/0x2fc [<c079d06a>] __init_begin+0x6a/0x6f [<ffffffff>] 0xffffffff } ... key at: [<c0c525d0>] __key.14261+0x0/0x10 ... 
acquired at: [<c044d243>] validate_chain+0x8a8/0xbae [<c044dbfd>] __lock_acquire+0x6b4/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c047ad74>] cpupri_set+0x8a/0xba [<c04216f2>] rq_online_rt+0x5e/0x61 [<c041dd3a>] set_rq_online+0x40/0x4a [<c04239fb>] rq_attach_root+0x84/0xa7 [<c07ae52c>] sched_init+0x240/0x33e [<c079d661>] start_kernel+0x107/0x2fc [<c079d06a>] __init_begin+0x6a/0x6f [<ffffffff>] 0xffffffff -> (&rt_b->rt_runtime_lock){-.-...} ops: 336 { IN-HARDIRQ-W at: [<c044d9e4>] __lock_acquire+0x49b/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 [<c0421a75>] __enqueue_rt_entity+0x169/0x1c8 [<c0421e18>] enqueue_rt_entity+0x19/0x23 [<c0428a52>] enqueue_task_rt+0x24/0x51 [<c041e03b>] enqueue_task+0x64/0x70 [<c041e06b>] activate_task+0x24/0x2a [<c0426f9e>] try_to_wake_up+0x1d2/0x2d4 [<c04270d7>] wake_up_process+0x14/0x16 [<c04408b6>] hrtimer_wakeup+0x1d/0x21 [<c0440922>] __run_hrtimer+0x68/0x98 [<c04411ca>] hrtimer_interrupt+0x101/0x153 [<c063406e>] smp_apic_timer_interrupt+0x6e/0x81 [<c04033c7>] apic_timer_interrupt+0x2f/0x34 [<c0401c4f>] cpu_idle+0x53/0x85 [<c061fc80>] rest_init+0x6c/0x6e [<c079d851>] start_kernel+0x2f7/0x2fc [<c079d06a>] __init_begin+0x6a/0x6f [<ffffffff>] 0xffffffff IN-SOFTIRQ-W at: [<c044da08>] __lock_acquire+0x4bf/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 [<c0421a75>] __enqueue_rt_entity+0x169/0x1c8 [<c0421e18>] enqueue_rt_entity+0x19/0x23 [<c0428a52>] enqueue_task_rt+0x24/0x51 [<c041e03b>] enqueue_task+0x64/0x70 [<c041e06b>] activate_task+0x24/0x2a [<c0426f9e>] try_to_wake_up+0x1d2/0x2d4 [<c04270d7>] wake_up_process+0x14/0x16 [<c042737c>] rebalance_domains+0x2a3/0x3ac [<c0429a06>] run_rebalance_domains+0x32/0xaa [<c0431379>] __do_softirq+0xb8/0x180 [<ffffffff>] 0xffffffff INITIAL USE at: [<c044dad5>] __lock_acquire+0x58c/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 [<c0421a75>] __enqueue_rt_entity+0x169/0x1c8 
[<c0421e18>] enqueue_rt_entity+0x19/0x23 [<c0428a52>] enqueue_task_rt+0x24/0x51 [<c041e03b>] enqueue_task+0x64/0x70 [<c041e06b>] activate_task+0x24/0x2a [<c0426f9e>] try_to_wake_up+0x1d2/0x2d4 [<c04270d7>] wake_up_process+0x14/0x16 [<c062b86b>] migration_call+0xb9/0x444 [<c07ae130>] migration_init+0x3b/0x4b [<c040115c>] do_one_initcall+0x6a/0x16e [<c079d44d>] kernel_init+0x4d/0x15a [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff } ... key at: [<c0800504>] __key.37924+0x0/0x8 -> (&cpu_base->lock){-.-...} ops: 950512 { IN-HARDIRQ-W at: [<c044d9e4>] __lock_acquire+0x49b/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 [<c0440a3a>] hrtimer_run_queues+0xe8/0x131 [<c0435151>] run_local_timers+0xd/0x1e [<c0435486>] update_process_times+0x29/0x53 [<c0447fe0>] tick_periodic+0x6b/0x77 [<c0448009>] tick_handle_periodic+0x1d/0x60 [<c063406e>] smp_apic_timer_interrupt+0x6e/0x81 [<c04033c7>] apic_timer_interrupt+0x2f/0x34 [<c04082c7>] arch_dup_task_struct+0x19/0x81 [<c042ac1c>] copy_process+0xab/0x115f [<c042be78>] do_fork+0x129/0x2c5 [<c0401698>] kernel_thread+0x7f/0x87 [<c043e0b3>] kthreadd+0xa3/0xe3 [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff IN-SOFTIRQ-W at: [<c044da08>] __lock_acquire+0x4bf/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c0440b98>] lock_hrtimer_base+0x1d/0x38 [<c0440ca9>] __hrtimer_start_range_ns+0x1f/0x232 [<c0440ee7>] hrtimer_start_range_ns+0x15/0x17 [<c0448ef1>] tick_setup_sched_timer+0xf6/0x124 [<c0441558>] hrtimer_run_pending+0xb0/0xe8 [<c0434b76>] run_timer_softirq+0x27/0x1d1 [<c0431379>] __do_softirq+0xb8/0x180 [<ffffffff>] 0xffffffff INITIAL USE at: [<c044dad5>] __lock_acquire+0x58c/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c0440b98>] lock_hrtimer_base+0x1d/0x38 [<c0440ca9>] __hrtimer_start_range_ns+0x1f/0x232 [<c0421ab1>] __enqueue_rt_entity+0x1a5/0x1c8 [<c0421e18>] enqueue_rt_entity+0x19/0x23 
[<c0428a52>] enqueue_task_rt+0x24/0x51 [<c041e03b>] enqueue_task+0x64/0x70 [<c041e06b>] activate_task+0x24/0x2a [<c0426f9e>] try_to_wake_up+0x1d2/0x2d4 [<c04270d7>] wake_up_process+0x14/0x16 [<c062b86b>] migration_call+0xb9/0x444 [<c07ae130>] migration_init+0x3b/0x4b [<c040115c>] do_one_initcall+0x6a/0x16e [<c079d44d>] kernel_init+0x4d/0x15a [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff } ... key at: [<c08234b8>] __key.20063+0x0/0x8 ... acquired at: [<c044d243>] validate_chain+0x8a8/0xbae [<c044dbfd>] __lock_acquire+0x6b4/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c0440b98>] lock_hrtimer_base+0x1d/0x38 [<c0440ca9>] __hrtimer_start_range_ns+0x1f/0x232 [<c0421ab1>] __enqueue_rt_entity+0x1a5/0x1c8 [<c0421e18>] enqueue_rt_entity+0x19/0x23 [<c0428a52>] enqueue_task_rt+0x24/0x51 [<c041e03b>] enqueue_task+0x64/0x70 [<c041e06b>] activate_task+0x24/0x2a [<c0426f9e>] try_to_wake_up+0x1d2/0x2d4 [<c04270d7>] wake_up_process+0x14/0x16 [<c062b86b>] migration_call+0xb9/0x444 [<c07ae130>] migration_init+0x3b/0x4b [<c040115c>] do_one_initcall+0x6a/0x16e [<c079d44d>] kernel_init+0x4d/0x15a [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff -> (&rt_rq->rt_runtime_lock){-.....} ops: 17587 { IN-HARDIRQ-W at: [<c044d9e4>] __lock_acquire+0x49b/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 [<c0421efc>] sched_rt_period_timer+0xda/0x24e [<c0440922>] __run_hrtimer+0x68/0x98 [<c04411ca>] hrtimer_interrupt+0x101/0x153 [<c063406e>] smp_apic_timer_interrupt+0x6e/0x81 [<c04033c7>] apic_timer_interrupt+0x2f/0x34 [<c0452203>] each_symbol_in_section+0x27/0x57 [<c045225a>] each_symbol+0x27/0x113 [<c0452373>] find_symbol+0x2d/0x51 [<c0454a7a>] load_module+0xaec/0x10eb [<c04550bf>] sys_init_module+0x46/0x19b [<c0402a68>] sysenter_do_call+0x12/0x36 [<ffffffff>] 0xffffffff INITIAL USE at: [<c044dad5>] __lock_acquire+0x58c/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 
[<c0421c41>] update_curr_rt+0x13a/0x20d [<c0421dd8>] dequeue_task_rt+0x13/0x3a [<c041df9e>] dequeue_task+0xff/0x10e [<c041dfd1>] deactivate_task+0x24/0x2a [<c062db54>] __schedule+0x162/0x991 [<c062e39a>] schedule+0x17/0x30 [<c0426c54>] migration_thread+0x175/0x203 [<c043e204>] kthread+0x4a/0x72 [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff } ... key at: [<c080050c>] __key.46863+0x0/0x8 ... acquired at: [<c044d243>] validate_chain+0x8a8/0xbae [<c044dbfd>] __lock_acquire+0x6b4/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 [<c041ee73>] __enable_runtime+0x43/0xb3 [<c04216d8>] rq_online_rt+0x44/0x61 [<c041dd3a>] set_rq_online+0x40/0x4a [<c062b8a5>] migration_call+0xf3/0x444 [<c063291c>] notifier_call_chain+0x2b/0x4a [<c0441e22>] __raw_notifier_call_chain+0x13/0x15 [<c0441e35>] raw_notifier_call_chain+0x11/0x13 [<c062bd2f>] _cpu_up+0xc3/0xf6 [<c062bdac>] cpu_up+0x4a/0x5a [<c079d49a>] kernel_init+0x9a/0x15a [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff ... acquired at: [<c044d243>] validate_chain+0x8a8/0xbae [<c044dbfd>] __lock_acquire+0x6b4/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 [<c0421a75>] __enqueue_rt_entity+0x169/0x1c8 [<c0421e18>] enqueue_rt_entity+0x19/0x23 [<c0428a52>] enqueue_task_rt+0x24/0x51 [<c041e03b>] enqueue_task+0x64/0x70 [<c041e06b>] activate_task+0x24/0x2a [<c0426f9e>] try_to_wake_up+0x1d2/0x2d4 [<c04270d7>] wake_up_process+0x14/0x16 [<c062b86b>] migration_call+0xb9/0x444 [<c07ae130>] migration_init+0x3b/0x4b [<c040115c>] do_one_initcall+0x6a/0x16e [<c079d44d>] kernel_init+0x4d/0x15a [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff ... 
acquired at: [<c044d243>] validate_chain+0x8a8/0xbae [<c044dbfd>] __lock_acquire+0x6b4/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 [<c0421c41>] update_curr_rt+0x13a/0x20d [<c0421dd8>] dequeue_task_rt+0x13/0x3a [<c041df9e>] dequeue_task+0xff/0x10e [<c041dfd1>] deactivate_task+0x24/0x2a [<c062db54>] __schedule+0x162/0x991 [<c062e39a>] schedule+0x17/0x30 [<c0426c54>] migration_thread+0x175/0x203 [<c043e204>] kthread+0x4a/0x72 [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff -> (&sig->cputimer.lock){......} ops: 1949 { INITIAL USE at: [<c044dad5>] __lock_acquire+0x58c/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c043f03e>] thread_group_cputimer+0x29/0x90 [<c044004c>] posix_cpu_timers_exit_group+0x16/0x39 [<c042e5f0>] release_task+0xa2/0x376 [<c042fbe1>] do_exit+0x548/0x5b3 [<c043a9d8>] __request_module+0x0/0x100 [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff } ... key at: [<c08014ac>] __key.15480+0x0/0x8 ... 
acquired at: [<c044d243>] validate_chain+0x8a8/0xbae [<c044dbfd>] __lock_acquire+0x6b4/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 [<c041f43a>] update_curr+0xef/0x107 [<c042131b>] enqueue_entity+0x1a/0x1c6 [<c0421535>] enqueue_task_fair+0x24/0x3e [<c041e03b>] enqueue_task+0x64/0x70 [<c041e06b>] activate_task+0x24/0x2a [<c0426f9e>] try_to_wake_up+0x1d2/0x2d4 [<c04270b0>] default_wake_function+0x10/0x12 [<c041d785>] __wake_up_common+0x34/0x5f [<c041ec26>] complete+0x30/0x43 [<c043e1e8>] kthread+0x2e/0x72 [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff -> (&rq->lock/1){..-...} ops: 3217 { IN-SOFTIRQ-W at: [<c044da08>] __lock_acquire+0x4bf/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630305>] _spin_lock_nested+0x2d/0x3e [<c0422cb4>] double_rq_lock+0x4b/0x7d [<c0427274>] rebalance_domains+0x19b/0x3ac [<c0429a06>] run_rebalance_domains+0x32/0xaa [<c0431379>] __do_softirq+0xb8/0x180 [<ffffffff>] 0xffffffff INITIAL USE at: [<c044dad5>] __lock_acquire+0x58c/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630305>] _spin_lock_nested+0x2d/0x3e [<c0422cb4>] double_rq_lock+0x4b/0x7d [<c0427274>] rebalance_domains+0x19b/0x3ac [<c0429a06>] run_rebalance_domains+0x32/0xaa [<c0431379>] __do_softirq+0xb8/0x180 [<ffffffff>] 0xffffffff } ... key at: [<c0800519>] __key.46938+0x1/0x8 ... 
acquired at: [<c044d243>] validate_chain+0x8a8/0xbae [<c044dbfd>] __lock_acquire+0x6b4/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 [<c0421c41>] update_curr_rt+0x13a/0x20d [<c0421dd8>] dequeue_task_rt+0x13/0x3a [<c041df9e>] dequeue_task+0xff/0x10e [<c041dfd1>] deactivate_task+0x24/0x2a [<c0427b1b>] push_rt_task+0x189/0x1f7 [<c0427b9b>] push_rt_tasks+0x12/0x19 [<c0427bb9>] post_schedule_rt+0x17/0x21 [<c0425a68>] finish_task_switch+0x83/0xc0 [<c062e339>] __schedule+0x947/0x991 [<c062e39a>] schedule+0x17/0x30 [<c0426c54>] migration_thread+0x175/0x203 [<c043e204>] kthread+0x4a/0x72 [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff ... acquired at: [<c044d243>] validate_chain+0x8a8/0xbae [<c044dbfd>] __lock_acquire+0x6b4/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c047ad3b>] cpupri_set+0x51/0xba [<c04219ee>] __enqueue_rt_entity+0xe2/0x1c8 [<c0421e18>] enqueue_rt_entity+0x19/0x23 [<c0428a52>] enqueue_task_rt+0x24/0x51 [<c041e03b>] enqueue_task+0x64/0x70 [<c041e06b>] activate_task+0x24/0x2a [<c0427b33>] push_rt_task+0x1a1/0x1f7 [<c0427b9b>] push_rt_tasks+0x12/0x19 [<c0427bb9>] post_schedule_rt+0x17/0x21 [<c0425a68>] finish_task_switch+0x83/0xc0 [<c062e339>] __schedule+0x947/0x991 [<c062e39a>] schedule+0x17/0x30 [<c0426c54>] migration_thread+0x175/0x203 [<c043e204>] kthread+0x4a/0x72 [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff ... acquired at: [<c044d243>] validate_chain+0x8a8/0xbae [<c044dbfd>] __lock_acquire+0x6b4/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630305>] _spin_lock_nested+0x2d/0x3e [<c0422cb4>] double_rq_lock+0x4b/0x7d [<c0427274>] rebalance_domains+0x19b/0x3ac [<c0429a06>] run_rebalance_domains+0x32/0xaa [<c0431379>] __do_softirq+0xb8/0x180 [<ffffffff>] 0xffffffff ... 
acquired at: [<c044d243>] validate_chain+0x8a8/0xbae [<c044dbfd>] __lock_acquire+0x6b4/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 [<c041ede7>] task_rq_lock+0x3b/0x62 [<c0426e41>] try_to_wake_up+0x75/0x2d4 [<c04270b0>] default_wake_function+0x10/0x12 [<c041d785>] __wake_up_common+0x34/0x5f [<c041ec26>] complete+0x30/0x43 [<c043e0cc>] kthreadd+0xbc/0xe3 [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff -> (&ep->lock){......} ops: 110 { INITIAL USE at: [<c044dad5>] __lock_acquire+0x58c/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c04ca381>] sys_epoll_ctl+0x232/0x3f6 [<c0402a68>] sysenter_do_call+0x12/0x36 [<ffffffff>] 0xffffffff } ... key at: [<c0c5be90>] __key.22301+0x0/0x10 ... acquired at: [<c044d243>] validate_chain+0x8a8/0xbae [<c044dbfd>] __lock_acquire+0x6b4/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 [<c041ede7>] task_rq_lock+0x3b/0x62 [<c0426e41>] try_to_wake_up+0x75/0x2d4 [<c04270b0>] default_wake_function+0x10/0x12 [<c041d785>] __wake_up_common+0x34/0x5f [<c041d7c6>] __wake_up_locked+0x16/0x1a [<c04ca7f5>] ep_poll_callback+0x7c/0xb6 [<c041d785>] __wake_up_common+0x34/0x5f [<c041ec70>] __wake_up_sync_key+0x37/0x4a [<c05cbefa>] sock_def_readable+0x42/0x71 [<c061c8b1>] unix_stream_connect+0x2f3/0x368 [<c05c830a>] sys_connect+0x59/0x76 [<c05c963f>] sys_socketcall+0x76/0x172 [<c0402a68>] sysenter_do_call+0x12/0x36 [<ffffffff>] 0xffffffff ... 
acquired at: [<c044d243>] validate_chain+0x8a8/0xbae [<c044dbfd>] __lock_acquire+0x6b4/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c04ca797>] ep_poll_callback+0x1e/0xb6 [<c041d785>] __wake_up_common+0x34/0x5f [<c041ec70>] __wake_up_sync_key+0x37/0x4a [<c05cbefa>] sock_def_readable+0x42/0x71 [<c061c8b1>] unix_stream_connect+0x2f3/0x368 [<c05c830a>] sys_connect+0x59/0x76 [<c05c963f>] sys_socketcall+0x76/0x172 [<c0402a68>] sysenter_do_call+0x12/0x36 [<ffffffff>] 0xffffffff ... acquired at: [<c044d243>] validate_chain+0x8a8/0xbae [<c044dbfd>] __lock_acquire+0x6b4/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c041ec0d>] complete+0x17/0x43 [<c0509cf2>] blk_end_sync_rq+0x2a/0x2d [<c0506935>] end_that_request_last+0x17b/0x1a1 [<c0506a0d>] blk_end_io+0x51/0x6f [<c0506a64>] blk_end_request+0x11/0x13 [<f8106c9c>] scsi_io_completion+0x1d9/0x41f [scsi_mod] [<f810152d>] scsi_finish_command+0xcc/0xd4 [scsi_mod] [<f8106fdb>] scsi_softirq_done+0xf9/0x101 [scsi_mod] [<c050a936>] blk_done_softirq+0x5e/0x70 [<c0431379>] __do_softirq+0xb8/0x180 [<ffffffff>] 0xffffffff -> (&n->list_lock){..-...} ops: 49241 { IN-SOFTIRQ-W at: [<c044da08>] __lock_acquire+0x4bf/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 [<c049bd18>] add_partial+0x16/0x40 [<c049d0d4>] __slab_free+0x96/0x28f [<c049df5c>] kmem_cache_free+0x8c/0xf2 [<c04a5ce9>] file_free_rcu+0x35/0x38 [<c0461a12>] rcu_process_callbacks+0x62/0x86 [<c0431379>] __do_softirq+0xb8/0x180 [<ffffffff>] 0xffffffff INITIAL USE at: [<c044dad5>] __lock_acquire+0x58c/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 [<c049bd18>] add_partial+0x16/0x40 [<c049d0d4>] __slab_free+0x96/0x28f [<c049df5c>] kmem_cache_free+0x8c/0xf2 [<c0514eda>] ida_get_new_above+0x13b/0x155 [<c0514f00>] ida_get_new+0xc/0xe [<c04a628b>] set_anon_super+0x39/0xa3 [<c04a68c6>] sget+0x2f3/0x386 [<c04a7365>] get_sb_single+0x24/0x8f [<c04e034c>] 
sysfs_get_sb+0x18/0x1a [<c04a6dd1>] vfs_kern_mount+0x40/0x7b [<c04a6e21>] kern_mount_data+0x15/0x17 [<c07b5ff6>] sysfs_init+0x50/0x9c [<c07b4ac9>] mnt_init+0x8c/0x1e4 [<c07b4737>] vfs_caches_init+0xd8/0xea [<c079d815>] start_kernel+0x2bb/0x2fc [<c079d06a>] __init_begin+0x6a/0x6f [<ffffffff>] 0xffffffff } ... key at: [<c0c5a424>] __key.25358+0x0/0x8 ... acquired at: [<c044d243>] validate_chain+0x8a8/0xbae [<c044dbfd>] __lock_acquire+0x6b4/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 [<c049cc45>] __slab_alloc+0xf6/0x4ef [<c049d333>] kmem_cache_alloc+0x66/0x11f [<f810189b>] scsi_pool_alloc_command+0x20/0x4c [scsi_mod] [<f81018de>] scsi_host_alloc_command+0x17/0x4f [scsi_mod] [<f810192b>] __scsi_get_command+0x15/0x71 [scsi_mod] [<f8101c41>] scsi_get_command+0x39/0x95 [scsi_mod] [<f81062b6>] scsi_get_cmd_from_req+0x26/0x50 [scsi_mod] [<f8106594>] scsi_setup_blk_pc_cmnd+0x2b/0xd7 [scsi_mod] [<f8106664>] scsi_prep_fn+0x24/0x33 [scsi_mod] [<c0504712>] elv_next_request+0xe6/0x18d [<f810704c>] scsi_request_fn+0x69/0x431 [scsi_mod] [<c05072af>] __generic_unplug_device+0x2e/0x31 [<c0509d59>] blk_execute_rq_nowait+0x64/0x86 [<c0509e2e>] blk_execute_rq+0xb3/0xd5 [<f81068f5>] scsi_execute+0xc5/0x11c [scsi_mod] [<f81069ff>] scsi_execute_req+0xb3/0x104 [scsi_mod] [<f812b40d>] sd_revalidate_disk+0x1a3/0xf64 [sd_mod] [<f812d52f>] sd_probe_async+0x146/0x22d [sd_mod] [<c044341f>] async_thread+0xe9/0x1c9 [<c043e204>] kthread+0x4a/0x72 [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff -> (&cwq->lock){-.-...} ops: 30335 { IN-HARDIRQ-W at: [<c044d9e4>] __lock_acquire+0x49b/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c043b54b>] __queue_work+0x14/0x30 [<c043b5ce>] queue_work_on+0x3a/0x46 [<c043b617>] queue_work+0x26/0x4a [<c043b64f>] schedule_work+0x14/0x16 [<c057a367>] schedule_console_callback+0x12/0x14 [<c05788ed>] kbd_event+0x595/0x600 [<c05b3d15>] input_pass_event+0x56/0x7e [<c05b4702>] 
input_handle_event+0x314/0x334 [<c05b4f1e>] input_event+0x50/0x63 [<c05b9bd4>] atkbd_interrupt+0x209/0x4e9 [<c05b1793>] serio_interrupt+0x38/0x6e [<c05b24e8>] i8042_interrupt+0x1db/0x1ec [<c045e922>] handle_IRQ_event+0xa4/0x169 [<c04602ea>] handle_edge_irq+0xc9/0x10a [<ffffffff>] 0xffffffff IN-SOFTIRQ-W at: [<c044da08>] __lock_acquire+0x4bf/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c043b54b>] __queue_work+0x14/0x30 [<c043b590>] delayed_work_timer_fn+0x29/0x2d [<c0434caa>] run_timer_softirq+0x15b/0x1d1 [<c0431379>] __do_softirq+0xb8/0x180 [<ffffffff>] 0xffffffff INITIAL USE at: [<c044dad5>] __lock_acquire+0x58c/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c043b54b>] __queue_work+0x14/0x30 [<c043b5ce>] queue_work_on+0x3a/0x46 [<c043b617>] queue_work+0x26/0x4a [<c043a7b3>] call_usermodehelper_exec+0x83/0xd0 [<c051631a>] kobject_uevent_env+0x351/0x385 [<c0516358>] kobject_uevent+0xa/0xc [<c0515a0e>] kset_register+0x2e/0x34 [<c0590f18>] bus_register+0xed/0x23d [<c07bea09>] platform_bus_init+0x23/0x38 [<c07beb77>] driver_init+0x1c/0x28 [<c079d4f6>] kernel_init+0xf6/0x15a [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff } ... 
key at: [<c08230a8>] __key.23814+0x0/0x8 -> (&workqueue_cpu_stat(cpu)->lock){-.-...} ops: 20397 { IN-HARDIRQ-W at: [<c044d9e4>] __lock_acquire+0x49b/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c0474909>] probe_workqueue_insertion+0x33/0x81 [<c043acf3>] insert_work+0x3f/0x9b [<c043b559>] __queue_work+0x22/0x30 [<c043b5ce>] queue_work_on+0x3a/0x46 [<c043b617>] queue_work+0x26/0x4a [<c043b64f>] schedule_work+0x14/0x16 [<c057a367>] schedule_console_callback+0x12/0x14 [<c05788ed>] kbd_event+0x595/0x600 [<c05b3d15>] input_pass_event+0x56/0x7e [<c05b4702>] input_handle_event+0x314/0x334 [<c05b4f1e>] input_event+0x50/0x63 [<c05b9bd4>] atkbd_interrupt+0x209/0x4e9 [<c05b1793>] serio_interrupt+0x38/0x6e [<c05b24e8>] i8042_interrupt+0x1db/0x1ec [<c045e922>] handle_IRQ_event+0xa4/0x169 [<c04602ea>] handle_edge_irq+0xc9/0x10a [<ffffffff>] 0xffffffff IN-SOFTIRQ-W at: [<c044da08>] __lock_acquire+0x4bf/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c0474909>] probe_workqueue_insertion+0x33/0x81 [<c043acf3>] insert_work+0x3f/0x9b [<c043b559>] __queue_work+0x22/0x30 [<c043b590>] delayed_work_timer_fn+0x29/0x2d [<c0434caa>] run_timer_softirq+0x15b/0x1d1 [<c0431379>] __do_softirq+0xb8/0x180 [<ffffffff>] 0xffffffff INITIAL USE at: [<c044dad5>] __lock_acquire+0x58c/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c04747eb>] probe_workqueue_creation+0xc9/0x10a [<c043abcb>] create_workqueue_thread+0x87/0xb0 [<c043b12f>] __create_workqueue_key+0x16d/0x1b2 [<c07aeedb>] init_workqueues+0x61/0x73 [<c079d4e7>] kernel_init+0xe7/0x15a [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff } ... key at: [<c0c52574>] __key.23424+0x0/0x8 ... 
acquired at: [<c044d243>] validate_chain+0x8a8/0xbae [<c044dbfd>] __lock_acquire+0x6b4/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c0474909>] probe_workqueue_insertion+0x33/0x81 [<c043acf3>] insert_work+0x3f/0x9b [<c043b559>] __queue_work+0x22/0x30 [<c043b5ce>] queue_work_on+0x3a/0x46 [<c043b617>] queue_work+0x26/0x4a [<c043a7b3>] call_usermodehelper_exec+0x83/0xd0 [<c051631a>] kobject_uevent_env+0x351/0x385 [<c0516358>] kobject_uevent+0xa/0xc [<c0515a0e>] kset_register+0x2e/0x34 [<c0590f18>] bus_register+0xed/0x23d [<c07bea09>] platform_bus_init+0x23/0x38 [<c07beb77>] driver_init+0x1c/0x28 [<c079d4f6>] kernel_init+0xf6/0x15a [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff ... acquired at: [<c044d243>] validate_chain+0x8a8/0xbae [<c044dbfd>] __lock_acquire+0x6b4/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c041ecaf>] __wake_up+0x1a/0x40 [<c043ad46>] insert_work+0x92/0x9b [<c043b559>] __queue_work+0x22/0x30 [<c043b5ce>] queue_work_on+0x3a/0x46 [<c043b617>] queue_work+0x26/0x4a [<c043a7b3>] call_usermodehelper_exec+0x83/0xd0 [<c051631a>] kobject_uevent_env+0x351/0x385 [<c0516358>] kobject_uevent+0xa/0xc [<c0515a0e>] kset_register+0x2e/0x34 [<c0590f18>] bus_register+0xed/0x23d [<c07bea09>] platform_bus_init+0x23/0x38 [<c07beb77>] driver_init+0x1c/0x28 [<c079d4f6>] kernel_init+0xf6/0x15a [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff ... 
acquired at: [<c044d243>] validate_chain+0x8a8/0xbae [<c044dbfd>] __lock_acquire+0x6b4/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c043b54b>] __queue_work+0x14/0x30 [<c043b5ce>] queue_work_on+0x3a/0x46 [<c043b617>] queue_work+0x26/0x4a [<c0505679>] kblockd_schedule_work+0x12/0x14 [<c05113bb>] elv_schedule_dispatch+0x41/0x48 [<c0513377>] elv_ioq_completed_request+0x2dc/0x2fe [<c05045aa>] elv_completed_request+0x48/0x97 [<c0506738>] __blk_put_request+0x36/0xb8 [<c0506953>] end_that_request_last+0x199/0x1a1 [<c0506a0d>] blk_end_io+0x51/0x6f [<c0506a64>] blk_end_request+0x11/0x13 [<f8106c9c>] scsi_io_completion+0x1d9/0x41f [scsi_mod] [<f810152d>] scsi_finish_command+0xcc/0xd4 [scsi_mod] [<f8106fdb>] scsi_softirq_done+0xf9/0x101 [scsi_mod] [<c050a936>] blk_done_softirq+0x5e/0x70 [<c0431379>] __do_softirq+0xb8/0x180 [<ffffffff>] 0xffffffff -> (&zone->lock){..-...} ops: 80266 { IN-SOFTIRQ-W at: [<c044da08>] __lock_acquire+0x4bf/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 [<c047fc71>] __free_pages_ok+0x167/0x321 [<c04800ce>] __free_pages+0x29/0x2b [<c049c7c1>] __free_slab+0xb2/0xba [<c049c800>] discard_slab+0x37/0x39 [<c049d15c>] __slab_free+0x11e/0x28f [<c049df5c>] kmem_cache_free+0x8c/0xf2 [<c042ab6e>] free_task+0x31/0x34 [<c042c37b>] __put_task_struct+0xd3/0xd8 [<c042e072>] delayed_put_task_struct+0x60/0x64 [<c0461a12>] rcu_process_callbacks+0x62/0x86 [<c0431379>] __do_softirq+0xb8/0x180 [<ffffffff>] 0xffffffff INITIAL USE at: [<c044dad5>] __lock_acquire+0x58c/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 [<c047f7b6>] free_pages_bulk+0x21/0x1a1 [<c047ffcf>] free_hot_cold_page+0x181/0x20f [<c04800a3>] free_hot_page+0xf/0x11 [<c04800c5>] __free_pages+0x20/0x2b [<c07c4d96>] __free_pages_bootmem+0x6d/0x71 [<c07b2244>] free_all_bootmem_core+0xd2/0x177 [<c07b22f6>] free_all_bootmem+0xd/0xf [<c07ad21a>] mem_init+0x28/0x28c [<c079d7b1>] start_kernel+0x257/0x2fc [<c079d06a>] 
__init_begin+0x6a/0x6f [<ffffffff>] 0xffffffff } ... key at: [<c0c52628>] __key.30749+0x0/0x8 ... acquired at: [<c044d243>] validate_chain+0x8a8/0xbae [<c044dbfd>] __lock_acquire+0x6b4/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c048035e>] get_page_from_freelist+0x236/0x3e3 [<c04805f4>] __alloc_pages_internal+0xce/0x371 [<c049cce6>] __slab_alloc+0x197/0x4ef [<c049d333>] kmem_cache_alloc+0x66/0x11f [<c047d96b>] mempool_alloc_slab+0x13/0x15 [<c047da5c>] mempool_alloc+0x3a/0xd5 [<f81063cc>] scsi_sg_alloc+0x47/0x4a [scsi_mod] [<c051cd02>] __sg_alloc_table+0x48/0xc7 [<f8106325>] scsi_init_sgtable+0x2c/0x8c [scsi_mod] [<f81064e7>] scsi_init_io+0x19/0x9b [scsi_mod] [<f8106abf>] scsi_setup_fs_cmnd+0x6f/0x73 [scsi_mod] [<f812ca73>] sd_prep_fn+0x6a/0x7d4 [sd_mod] [<c0504712>] elv_next_request+0xe6/0x18d [<f810704c>] scsi_request_fn+0x69/0x431 [scsi_mod] [<c05072af>] __generic_unplug_device+0x2e/0x31 [<c05072db>] blk_start_queueing+0x29/0x2b [<c05137b8>] elv_ioq_request_add+0x2be/0x393 [<c05048cd>] elv_insert+0x114/0x1a2 [<c05049ec>] __elv_add_request+0x91/0x96 [<c0507a00>] __make_request+0x365/0x397 [<c050635a>] generic_make_request+0x342/0x3ce [<c0507b21>] submit_bio+0xef/0xfa [<c04c6c4e>] mpage_bio_submit+0x21/0x26 [<c04c7b7f>] mpage_readpages+0xa3/0xad [<f80c1ea8>] ext3_readpages+0x19/0x1b [ext3] [<c048275e>] __do_page_cache_readahead+0xfd/0x166 [<c0482b42>] do_page_cache_readahead+0x44/0x52 [<c047d665>] filemap_fault+0x197/0x3ae [<c048b9ea>] __do_fault+0x40/0x37b [<c048d43f>] handle_mm_fault+0x2bb/0x646 [<c063273c>] do_page_fault+0x29c/0x2fd [<c0630b4a>] error_code+0x72/0x78 [<ffffffff>] 0xffffffff -> (&page_address_htable[i].lock){......} ops: 6802 { INITIAL USE at: [<c044dad5>] __lock_acquire+0x58c/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c048af69>] page_address+0x50/0xa6 [<c048b0e7>] kmap_high+0x21/0x175 [<c041b7ef>] kmap+0x4e/0x5b [<c04abb36>] page_getlink+0x37/0x59 [<c04abb75>] 
page_follow_link_light+0x1d/0x2b [<c04ad4d0>] __link_path_walk+0x3d1/0xa71 [<c04adbae>] path_walk+0x3e/0x77 [<c04add0e>] do_path_lookup+0xeb/0x105 [<c04ae6f2>] path_lookup_open+0x48/0x7a [<c04a8e96>] open_exec+0x25/0xf4 [<c04a9c2d>] do_execve+0xfa/0x2cc [<c04015c0>] sys_execve+0x2b/0x54 [<c0402ae9>] syscall_call+0x7/0xb [<ffffffff>] 0xffffffff } ... key at: [<c0c5288c>] __key.28547+0x0/0x14 ... acquired at: [<c044d243>] validate_chain+0x8a8/0xbae [<c044dbfd>] __lock_acquire+0x6b4/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c048af69>] page_address+0x50/0xa6 [<c05078a1>] __make_request+0x206/0x397 [<c050635a>] generic_make_request+0x342/0x3ce [<c0507b21>] submit_bio+0xef/0xfa [<c04c6c4e>] mpage_bio_submit+0x21/0x26 [<c04c78b8>] do_mpage_readpage+0x471/0x5e5 [<c04c7b55>] mpage_readpages+0x79/0xad [<f80c1ea8>] ext3_readpages+0x19/0x1b [ext3] [<c048275e>] __do_page_cache_readahead+0xfd/0x166 [<c0482b42>] do_page_cache_readahead+0x44/0x52 [<c047d665>] filemap_fault+0x197/0x3ae [<c048b9ea>] __do_fault+0x40/0x37b [<c048d43f>] handle_mm_fault+0x2bb/0x646 [<c063273c>] do_page_fault+0x29c/0x2fd [<c0630b4a>] error_code+0x72/0x78 [<ffffffff>] 0xffffffff ... 
acquired at: [<c044d243>] validate_chain+0x8a8/0xbae [<c044dbfd>] __lock_acquire+0x6b4/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 [<c046143d>] call_rcu+0x36/0x5b [<c050f0c8>] cfq_cic_free+0x15/0x17 [<c050f128>] cic_free_func+0x5e/0x64 [<c050ea90>] __call_for_each_cic+0x23/0x2e [<c050eaad>] cfq_free_io_context+0x12/0x14 [<c050978c>] put_io_context+0x4b/0x66 [<c050f00a>] cfq_active_ioq_reset+0x21/0x39 [<c0511044>] elv_reset_active_ioq+0x2b/0x3e [<c0512ecf>] __elv_ioq_slice_expired+0x238/0x26a [<c0512f1f>] elv_ioq_slice_expired+0x1e/0x20 [<c0513860>] elv_ioq_request_add+0x366/0x393 [<c05048cd>] elv_insert+0x114/0x1a2 [<c05049ec>] __elv_add_request+0x91/0x96 [<c0507a00>] __make_request+0x365/0x397 [<c050635a>] generic_make_request+0x342/0x3ce [<c0507b21>] submit_bio+0xef/0xfa [<c04bf495>] submit_bh+0xe3/0x102 [<c04c04b0>] ll_rw_block+0xbe/0xf7 [<f80c35ba>] ext3_bread+0x39/0x79 [ext3] [<f80c5643>] dx_probe+0x2f/0x298 [ext3] [<f80c5956>] ext3_find_entry+0xaa/0x573 [ext3] [<f80c739e>] ext3_lookup+0x31/0xbe [ext3] [<c04abf7c>] do_lookup+0xbc/0x159 [<c04ad7e8>] __link_path_walk+0x6e9/0xa71 [<c04adbae>] path_walk+0x3e/0x77 [<c04add0e>] do_path_lookup+0xeb/0x105 [<c04ae584>] user_path_at+0x41/0x6c [<c04a8301>] vfs_fstatat+0x32/0x59 [<c04a8417>] vfs_stat+0x18/0x1a [<c04a8432>] sys_stat64+0x19/0x2d [<c0402a68>] sysenter_do_call+0x12/0x36 [<ffffffff>] 0xffffffff -> (&iocg->lock){+.+...} ops: 3 { HARDIRQ-ON-W at: [<c044b840>] mark_held_locks+0x3d/0x58 [<c044b963>] trace_hardirqs_on_caller+0x108/0x14c [<c044b9b2>] trace_hardirqs_on+0xb/0xd [<c0630883>] _spin_unlock_irq+0x27/0x47 [<c0513baa>] iocg_destroy+0xbc/0x118 [<c045a16a>] cgroup_diput+0x4b/0xa7 [<c04b1dbb>] dentry_iput+0x78/0x9c [<c04b1e82>] d_kill+0x21/0x3b [<c04b2f2a>] dput+0xf3/0xfc [<c04ae226>] do_rmdir+0x9a/0xc8 [<c04ae29d>] sys_rmdir+0x15/0x17 [<c0402a68>] sysenter_do_call+0x12/0x36 [<ffffffff>] 0xffffffff SOFTIRQ-ON-W at: [<c044b840>] mark_held_locks+0x3d/0x58 [<c044b97c>] 
trace_hardirqs_on_caller+0x121/0x14c [<c044b9b2>] trace_hardirqs_on+0xb/0xd [<c0630883>] _spin_unlock_irq+0x27/0x47 [<c0513baa>] iocg_destroy+0xbc/0x118 [<c045a16a>] cgroup_diput+0x4b/0xa7 [<c04b1dbb>] dentry_iput+0x78/0x9c [<c04b1e82>] d_kill+0x21/0x3b [<c04b2f2a>] dput+0xf3/0xfc [<c04ae226>] do_rmdir+0x9a/0xc8 [<c04ae29d>] sys_rmdir+0x15/0x17 [<c0402a68>] sysenter_do_call+0x12/0x36 [<ffffffff>] 0xffffffff INITIAL USE at: [<c044dad5>] __lock_acquire+0x58c/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c06304ea>] _spin_lock_irq+0x30/0x3f [<c05119bd>] io_alloc_root_group+0x104/0x155 [<c05133cb>] elv_init_fq_data+0x32/0xe0 [<c0504317>] elevator_alloc+0x150/0x170 [<c0505393>] elevator_init+0x9d/0x100 [<c0507088>] blk_init_queue_node+0xc4/0xf7 [<c05070cb>] blk_init_queue+0x10/0x12 [<f81060fd>] __scsi_alloc_queue+0x1c/0xba [scsi_mod] [<f81061b0>] scsi_alloc_queue+0x15/0x4e [scsi_mod] [<f810803d>] scsi_alloc_sdev+0x154/0x1f5 [scsi_mod] [<f8108387>] scsi_probe_and_add_lun+0x123/0xb5b [scsi_mod] [<f8109847>] __scsi_add_device+0x8a/0xb0 [scsi_mod] [<f816ad14>] ata_scsi_scan_host+0x77/0x141 [libata] [<f816903f>] async_port_probe+0xa0/0xa9 [libata] [<c044341f>] async_thread+0xe9/0x1c9 [<c043e204>] kthread+0x4a/0x72 [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff } ... key at: [<c0c5ebd8>] __key.29462+0x0/0x8 ... 
acquired at:
  [<c044d243>] validate_chain+0x8a8/0xbae
  [<c044dbfd>] __lock_acquire+0x6b4/0x73e
  [<c044dd36>] lock_acquire+0xaf/0xcc
  [<c063056b>] _spin_lock_irqsave+0x33/0x43
  [<c0510f6f>] io_group_chain_link+0x5c/0x106
  [<c0511ba7>] io_find_alloc_group+0x54/0x60
  [<c0511c11>] io_get_io_group_bio+0x5e/0x89
  [<c0511cc3>] io_group_get_request_list+0x12/0x21
  [<c0507485>] get_request_wait+0x124/0x15d
  [<c050797e>] __make_request+0x2e3/0x397
  [<c050635a>] generic_make_request+0x342/0x3ce
  [<c0507b21>] submit_bio+0xef/0xfa
  [<c04c6c4e>] mpage_bio_submit+0x21/0x26
  [<c04c7b7f>] mpage_readpages+0xa3/0xad
  [<f80c1ea8>] ext3_readpages+0x19/0x1b [ext3]
  [<c048275e>] __do_page_cache_readahead+0xfd/0x166
  [<c048294a>] ondemand_readahead+0x10a/0x118
  [<c04829db>] page_cache_sync_readahead+0x1b/0x20
  [<c047cf37>] generic_file_aio_read+0x226/0x545
  [<c04a4cf6>] do_sync_read+0xb0/0xee
  [<c04a54b0>] vfs_read+0x8f/0x136
  [<c04a8d7c>] kernel_read+0x39/0x4b
  [<c04a8e69>] prepare_binprm+0xdb/0xe3
  [<c04a9ca8>] do_execve+0x175/0x2cc
  [<c04015c0>] sys_execve+0x2b/0x54
  [<c0402a68>] sysenter_do_call+0x12/0x36
  [<ffffffff>] 0xffffffff

stack backtrace:
Pid: 2186, comm: rmdir Not tainted 2.6.30-rc4-io #6
Call Trace:
 [<c044b1ac>] print_irq_inversion_bug+0x13b/0x147
 [<c044c3e5>] check_usage_backwards+0x7d/0x86
 [<c044b5ec>] mark_lock+0x2d3/0x4ea
 [<c044c368>] ? check_usage_backwards+0x0/0x86
 [<c044b840>] mark_held_locks+0x3d/0x58
 [<c0630883>] ? _spin_unlock_irq+0x27/0x47
 [<c044b97c>] trace_hardirqs_on_caller+0x121/0x14c
 [<c044b9b2>] trace_hardirqs_on+0xb/0xd
 [<c0630883>] _spin_unlock_irq+0x27/0x47
 [<c0513baa>] iocg_destroy+0xbc/0x118
 [<c045a16a>] cgroup_diput+0x4b/0xa7
 [<c04b1dbb>] dentry_iput+0x78/0x9c
 [<c04b1e82>] d_kill+0x21/0x3b
 [<c04b2f2a>] dput+0xf3/0xfc
 [<c04ae226>] do_rmdir+0x9a/0xc8
 [<c04029b1>] ? resume_userspace+0x11/0x28
 [<c051aa14>] ? trace_hardirqs_on_thunk+0xc/0x10
 [<c0402b34>] ? restore_nocheck_notrace+0x0/0xe
 [<c06324a0>] ? do_page_fault+0x0/0x2fd
 [<c044b97c>] ? trace_hardirqs_on_caller+0x121/0x14c
 [<c04ae29d>] sys_rmdir+0x15/0x17
 [<c0402a68>] sysenter_do_call+0x12/0x36
^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: IO scheduler based IO Controller V2 2009-05-07 5:36 ` Li Zefan @ 2009-05-08 13:37 ` Vivek Goyal 0 siblings, 0 replies; 97+ messages in thread From: Vivek Goyal @ 2009-05-08 13:37 UTC (permalink / raw) To: Li Zefan Cc: dhaval-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, snitzer-H+wXaHxf7aLQT0dZR+AlfA, dm-devel-H+wXaHxf7aLQT0dZR+AlfA, jens.axboe-QHcLZuEGTsvQT0dZR+AlfA, agk-H+wXaHxf7aLQT0dZR+AlfA, balbir-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, paolo.valente-rcYM44yAMweonA0d6jMUrA, fernando-gVGce1chcLdL9jVzuh4AOg, jmoyer-H+wXaHxf7aLQT0dZR+AlfA, fchecconi-Re5JQEeQqe8AvxtiuMwx3w, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, righi.andrea-Re5JQEeQqe8AvxtiuMwx3w On Thu, May 07, 2009 at 01:36:08PM +0800, Li Zefan wrote: > Vivek Goyal wrote: > > On Wed, May 06, 2009 at 04:11:05PM +0800, Gui Jianfeng wrote: > >> Vivek Goyal wrote: > >>> Hi All, > >>> > >>> Here is the V2 of the IO controller patches generated on top of 2.6.30-rc4. > >>> First version of the patches was posted here. > >> Hi Vivek, > >> > >> I did some simple test for V2, and triggered an kernel panic. > >> The following script can reproduce this bug. It seems that the cgroup > >> is already removed, but IO Controller still try to access into it. > >> > > > > Hi Gui, > > > > Thanks for the report. I use cgroup_path() for debugging. I guess that > > cgroup_path() was passed null cgrp pointer that's why it crashed. > > > > If yes, then it is strange though. I call cgroup_path() only after > > grabbing a refenrece to css object. (I am assuming that if I have a valid > > reference to css object then css->cgrp can't be null). > > > > Yes, css->cgrp shouldn't be NULL.. I doubt we hit a bug in cgroup here. > The code dealing with css refcnt and cgroup rmdir has changed quite a lot, > and is much more complex than it was. > > > Anyway, can you please try out following patch and see if it fixes your > > crash. > ... 
> > BTW, I tried following equivalent script and I can't see the crash on
> > my system. Are you able to hit it regularly?
> >
>
> I modified the script like this:
>
> ======================
> #!/bin/sh
> echo 1 > /proc/sys/vm/drop_caches
> mkdir /cgroup 2> /dev/null
> mount -t cgroup -o io,blkio io /cgroup
> mkdir /cgroup/test1
> mkdir /cgroup/test2
> echo 100 > /cgroup/test1/io.weight
> echo 500 > /cgroup/test2/io.weight
>
> dd if=/dev/zero bs=4096 count=128000 of=500M.1 &
> pid1=$!
> echo $pid1 > /cgroup/test1/tasks
>
> dd if=/dev/zero bs=4096 count=128000 of=500M.2 &
> pid2=$!
> echo $pid2 > /cgroup/test2/tasks
>
> sleep 5
> kill -9 $pid1
> kill -9 $pid2
>
> for ((;count != 2;))
> {
> rmdir /cgroup/test1 > /dev/null 2>&1
> if [ $? -eq 0 ]; then
> count=$(( $count + 1 ))
> fi
>
> rmdir /cgroup/test2 > /dev/null 2>&1
> if [ $? -eq 0 ]; then
> count=$(( $count + 1 ))
> fi
> }
>
> umount /cgroup
> rmdir /cgroup
> ======================
>
> I ran this script and got lockdep BUG. Full log and my config are attached.
>
> Actually this can be triggered with the following steps on my box:
> # mount -t cgroup -o blkio,io xxx /mnt
> # mkdir /mnt/0
> # echo $$ > /mnt/0/tasks
> # echo 3 > /proc/sys/vm/drop_cache
> # echo $$ > /mnt/tasks
> # rmdir /mnt/0
>
> And when I ran the script for the second time, my box was freezed
> and I had to reset it.
>

Thanks Li and Gui for pointing out the problem. With your script, I could
also reproduce the lock validator warning as well as the system freeze. I
could identify at least two trouble spots. With the following patch, things
seem to be fine on my system. Can you please give it a try?
---
 block/elevator-fq.c |   20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

Index: linux11/block/elevator-fq.c
===================================================================
--- linux11.orig/block/elevator-fq.c	2009-05-08 08:47:45.000000000 -0400
+++ linux11/block/elevator-fq.c	2009-05-08 09:27:37.000000000 -0400
@@ -942,6 +942,7 @@ void entity_served(struct io_entity *ent
 	struct io_service_tree *st;
 
 	for_each_entity(entity) {
+		BUG_ON(!entity->on_st);
 		st = io_entity_service_tree(entity);
 		entity->service += served;
 		entity->total_service += served;
@@ -1652,6 +1653,14 @@ static inline int io_group_has_active_en
 		return 1;
 	}
 
+	/*
+	 * Also check there are no active entities being served which are
+	 * not on active tree
+	 */
+
+	if (iog->sched_data.active_entity)
+		return 1;
+
 	return 0;
 }
 
@@ -1738,7 +1747,7 @@ void iocg_destroy(struct cgroup_subsys *
 	struct io_cgroup *iocg = cgroup_to_io_cgroup(cgroup);
 	struct hlist_node *n, *tmp;
 	struct io_group *iog;
-	unsigned long flags;
+	unsigned long flags, flags1;
 	int queue_lock_held = 0;
 	struct elv_fq_data *efqd;
 
@@ -1766,7 +1775,8 @@ retry:
 		rcu_read_lock();
 		efqd = rcu_dereference(iog->key);
 		if (efqd != NULL) {
-			if (spin_trylock_irq(efqd->queue->queue_lock)) {
+			if (spin_trylock_irqsave(efqd->queue->queue_lock,
+							flags1)) {
 				if (iog->key == efqd) {
 					queue_lock_held = 1;
 					rcu_read_unlock();
@@ -1780,7 +1790,8 @@ retry:
 				 * elevator hence we can proceed safely without
 				 * queue lock.
 				 */
-				spin_unlock_irq(efqd->queue->queue_lock);
+				spin_unlock_irqrestore(efqd->queue->queue_lock,
+							flags1);
 			} else {
 				/*
 				 * Did not get the queue lock while trying.
@@ -1803,7 +1814,7 @@ retry:
 locked:
 	__iocg_destroy(iocg, iog, queue_lock_held);
 	if (queue_lock_held) {
-		spin_unlock_irq(efqd->queue->queue_lock);
+		spin_unlock_irqrestore(efqd->queue->queue_lock, flags1);
 		queue_lock_held = 0;
 	}
 }
@@ -1811,6 +1822,7 @@ locked:
 
 	BUG_ON(!hlist_empty(&iocg->group_data));
 
+	free_css_id(&io_subsys, &iocg->css);
 	kfree(iocg);
 }

^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: IO scheduler based IO Controller V2 @ 2009-05-08 13:37 ` Vivek Goyal 0 siblings, 0 replies; 97+ messages in thread From: Vivek Goyal @ 2009-05-08 13:37 UTC (permalink / raw) To: Li Zefan Cc: Gui Jianfeng, nauman, dpshah, mikew, fchecconi, paolo.valente, jens.axboe, ryov, fernando, s-uchida, taka, jmoyer, dhaval, balbir, linux-kernel, containers, righi.andrea, agk, dm-devel, snitzer, m-ikeda, akpm On Thu, May 07, 2009 at 01:36:08PM +0800, Li Zefan wrote: > Vivek Goyal wrote: > > On Wed, May 06, 2009 at 04:11:05PM +0800, Gui Jianfeng wrote: > >> Vivek Goyal wrote: > >>> Hi All, > >>> > >>> Here is the V2 of the IO controller patches generated on top of 2.6.30-rc4. > >>> First version of the patches was posted here. > >> Hi Vivek, > >> > >> I did some simple test for V2, and triggered an kernel panic. > >> The following script can reproduce this bug. It seems that the cgroup > >> is already removed, but IO Controller still try to access into it. > >> > > > > Hi Gui, > > > > Thanks for the report. I use cgroup_path() for debugging. I guess that > > cgroup_path() was passed null cgrp pointer that's why it crashed. > > > > If yes, then it is strange though. I call cgroup_path() only after > > grabbing a refenrece to css object. (I am assuming that if I have a valid > > reference to css object then css->cgrp can't be null). > > > > Yes, css->cgrp shouldn't be NULL.. I doubt we hit a bug in cgroup here. > The code dealing with css refcnt and cgroup rmdir has changed quite a lot, > and is much more complex than it was. > > > Anyway, can you please try out following patch and see if it fixes your > > crash. > ... > > BTW, I tried following equivalent script and I can't see the crash on > > my system. Are you able to hit it regularly? 
> >
>
> I modified the script like this:
>
> ======================
> #!/bin/sh
> echo 1 > /proc/sys/vm/drop_caches
> mkdir /cgroup 2> /dev/null
> mount -t cgroup -o io,blkio io /cgroup
> mkdir /cgroup/test1
> mkdir /cgroup/test2
> echo 100 > /cgroup/test1/io.weight
> echo 500 > /cgroup/test2/io.weight
>
> dd if=/dev/zero bs=4096 count=128000 of=500M.1 &
> pid1=$!
> echo $pid1 > /cgroup/test1/tasks
>
> dd if=/dev/zero bs=4096 count=128000 of=500M.2 &
> pid2=$!
> echo $pid2 > /cgroup/test2/tasks
>
> sleep 5
> kill -9 $pid1
> kill -9 $pid2
>
> for ((;count != 2;))
> {
> rmdir /cgroup/test1 > /dev/null 2>&1
> if [ $? -eq 0 ]; then
> count=$(( $count + 1 ))
> fi
>
> rmdir /cgroup/test2 > /dev/null 2>&1
> if [ $? -eq 0 ]; then
> count=$(( $count + 1 ))
> fi
> }
>
> umount /cgroup
> rmdir /cgroup
> ======================
>
> I ran this script and got lockdep BUG. Full log and my config are attached.
>
> Actually this can be triggered with the following steps on my box:
> # mount -t cgroup -o blkio,io xxx /mnt
> # mkdir /mnt/0
> # echo $$ > /mnt/0/tasks
> # echo 3 > /proc/sys/vm/drop_cache
> # echo $$ > /mnt/tasks
> # rmdir /mnt/0
>
> And when I ran the script for the second time, my box was freezed
> and I had to reset it.
>

Thanks Li and Gui for pointing out the problem. With your script, I could
also reproduce the lock validator warning as well as the system freeze. I
could identify at least two trouble spots. With the following patch, things
seem to be fine on my system. Can you please give it a try?
---
 block/elevator-fq.c |   20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

Index: linux11/block/elevator-fq.c
===================================================================
--- linux11.orig/block/elevator-fq.c	2009-05-08 08:47:45.000000000 -0400
+++ linux11/block/elevator-fq.c	2009-05-08 09:27:37.000000000 -0400
@@ -942,6 +942,7 @@ void entity_served(struct io_entity *ent
 	struct io_service_tree *st;
 
 	for_each_entity(entity) {
+		BUG_ON(!entity->on_st);
 		st = io_entity_service_tree(entity);
 		entity->service += served;
 		entity->total_service += served;
@@ -1652,6 +1653,14 @@ static inline int io_group_has_active_en
 		return 1;
 	}
 
+	/*
+	 * Also check there are no active entities being served which are
+	 * not on active tree
+	 */
+
+	if (iog->sched_data.active_entity)
+		return 1;
+
 	return 0;
 }
 
@@ -1738,7 +1747,7 @@ void iocg_destroy(struct cgroup_subsys *
 	struct io_cgroup *iocg = cgroup_to_io_cgroup(cgroup);
 	struct hlist_node *n, *tmp;
 	struct io_group *iog;
-	unsigned long flags;
+	unsigned long flags, flags1;
 	int queue_lock_held = 0;
 	struct elv_fq_data *efqd;
 
@@ -1766,7 +1775,8 @@ retry:
 		rcu_read_lock();
 		efqd = rcu_dereference(iog->key);
 		if (efqd != NULL) {
-			if (spin_trylock_irq(efqd->queue->queue_lock)) {
+			if (spin_trylock_irqsave(efqd->queue->queue_lock,
+							flags1)) {
 				if (iog->key == efqd) {
 					queue_lock_held = 1;
 					rcu_read_unlock();
@@ -1780,7 +1790,8 @@ retry:
 				 * elevator hence we can proceed safely without
 				 * queue lock.
 				 */
-				spin_unlock_irq(efqd->queue->queue_lock);
+				spin_unlock_irqrestore(efqd->queue->queue_lock,
+							flags1);
 			} else {
 				/*
 				 * Did not get the queue lock while trying.
@@ -1803,7 +1814,7 @@ retry:
 locked:
 	__iocg_destroy(iocg, iog, queue_lock_held);
 	if (queue_lock_held) {
-		spin_unlock_irq(efqd->queue->queue_lock);
+		spin_unlock_irqrestore(efqd->queue->queue_lock, flags1);
 		queue_lock_held = 0;
 	}
 }
@@ -1811,6 +1822,7 @@ locked:
 
 	BUG_ON(!hlist_empty(&iocg->group_data));
 
+	free_css_id(&io_subsys, &iocg->css);
 	kfree(iocg);
 }

^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: IO scheduler based IO Controller V2
  2009-05-08 13:37 ` Vivek Goyal
  (?)
@ 2009-05-11 2:59 ` Gui Jianfeng
  -1 siblings, 0 replies; 97+ messages in thread
From: Gui Jianfeng @ 2009-05-11 2:59 UTC (permalink / raw)
To: Vivek Goyal
Cc: Li Zefan, nauman, dpshah, mikew, fchecconi, paolo.valente,
    jens.axboe, ryov, fernando, s-uchida, taka, jmoyer, dhaval, balbir,
    linux-kernel, containers, righi.andrea, agk, dm-devel, snitzer,
    m-ikeda, akpm

Vivek Goyal wrote:
...
>>
>
> Thanks Li and Gui for pointing out the problem. With you script, I could
> also produce lock validator warning as well as system freeze. I could
> identify at least two trouble spots. With following patch things seems
> to be fine on my system. Can you please give it a try.

Hi Vivek,

I've tried this patch, and it seems the problem is addressed. Thanks.

--
Regards
Gui Jianfeng

^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: IO scheduler based IO Controller V2 [not found] ` <20090508133740.GD7293-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2009-05-11 2:59 ` Gui Jianfeng 0 siblings, 0 replies; 97+ messages in thread From: Gui Jianfeng @ 2009-05-11 2:59 UTC (permalink / raw) To: Vivek Goyal Cc: dhaval-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, snitzer-H+wXaHxf7aLQT0dZR+AlfA, dm-devel-H+wXaHxf7aLQT0dZR+AlfA, jens.axboe-QHcLZuEGTsvQT0dZR+AlfA, agk-H+wXaHxf7aLQT0dZR+AlfA, balbir-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, paolo.valente-rcYM44yAMweonA0d6jMUrA, fernando-gVGce1chcLdL9jVzuh4AOg, jmoyer-H+wXaHxf7aLQT0dZR+AlfA, fchecconi-Re5JQEeQqe8AvxtiuMwx3w, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, righi.andrea-Re5JQEeQqe8AvxtiuMwx3w Vivek Goyal wrote: ... >> > > Thanks Li and Gui for pointing out the problem. With you script, I could > also produce lock validator warning as well as system freeze. I could > identify at least two trouble spots. With following patch things seems > to be fine on my system. Can you please give it a try. Hi Vivek, I've tried this patch, and it seems the problem is addressed. Thanks. -- Regards Gui Jianfeng ^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: IO scheduler based IO Controller V2
  2009-05-06 16:10 ` Vivek Goyal
  (?)
  (?)
@ 2009-05-07 5:47 ` Gui Jianfeng
  -1 siblings, 0 replies; 97+ messages in thread
From: Gui Jianfeng @ 2009-05-07 5:47 UTC (permalink / raw)
To: Vivek Goyal
Cc: nauman, dpshah, lizf, mikew, fchecconi, paolo.valente, jens.axboe,
    ryov, fernando, s-uchida, taka, jmoyer, dhaval, balbir,
    linux-kernel, containers, righi.andrea, agk, dm-devel, snitzer,
    m-ikeda, akpm

[-- Attachment #1: Type: text/plain, Size: 2218 bytes --]

Vivek Goyal wrote:
> Hi Gui,
>
> Thanks for the report. I use cgroup_path() for debugging. I guess that
> cgroup_path() was passed null cgrp pointer that's why it crashed.
>
> If yes, then it is strange though. I call cgroup_path() only after
> grabbing a refenrece to css object. (I am assuming that if I have a valid
> reference to css object then css->cgrp can't be null).

I think so too...

>
> Anyway, can you please try out following patch and see if it fixes your
> crash.
>
> ---
>  block/elevator-fq.c |   10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> Index: linux11/block/elevator-fq.c
> ===================================================================
> --- linux11.orig/block/elevator-fq.c	2009-05-05 15:38:06.000000000 -0400
> +++ linux11/block/elevator-fq.c	2009-05-06 11:55:47.000000000 -0400
> @@ -125,6 +125,9 @@ static void io_group_path(struct io_grou
>  	unsigned short id = iog->iocg_id;
>  	struct cgroup_subsys_state *css;
>
> +	/* For error case */
> +	buf[0] = '\0';
> +
>  	rcu_read_lock();
>
>  	if (!id)
>  		goto out;
> @@ -137,15 +140,12 @@ static void io_group_path(struct io_grou
>  	if (!css_tryget(css))
>  		goto out;
>
> -	cgroup_path(css->cgroup, buf, buflen);
> +	if (css->cgroup)

According to CR2, when the kernel crashed, css->cgroup equaled 0x00000100,
so I guess this patch won't fix this issue.
> +		cgroup_path(css->cgroup, buf, buflen);
>
>  	css_put(css);
> -
> -	rcu_read_unlock();
> -	return;
>  out:
>  	rcu_read_unlock();
> -	buf[0] = '\0';
>  	return;
>  }
>  #endif
>
> BTW, I tried following equivalent script and I can't see the crash on
> my system. Are you able to hit it regularly?

Yes, there's a 50% chance that I can reproduce it. I've attached the rwio
source code.

>
> Instead of killing the tasks I also tried moving the tasks into root cgroup
> and then deleting test1 and test2 groups, that also did not produce any crash.
> (Hit a different bug though after 5-6 attempts :-)
>
> As I mentioned in the patchset, currently we do have issues with group
> refcounting and cgroup/group going away. Hopefully in next version they
> all should be fixed up. But still, it is nice to hear back...
>

--
Regards
Gui Jianfeng

[-- Attachment #2: rwio.c --]
[-- Type: image/x-xbitmap, Size: 1613 bytes --]

^ permalink raw reply [flat|nested] 97+ messages in thread
* Re: IO scheduler based IO Controller V2 [not found] ` <20090506161012.GC8180-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org> @ 2009-05-07 5:36 ` Li Zefan 2009-05-07 5:47 ` Gui Jianfeng 1 sibling, 0 replies; 97+ messages in thread From: Li Zefan @ 2009-05-07 5:36 UTC (permalink / raw) To: Vivek Goyal Cc: dhaval-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, snitzer-H+wXaHxf7aLQT0dZR+AlfA, dm-devel-H+wXaHxf7aLQT0dZR+AlfA, jens.axboe-QHcLZuEGTsvQT0dZR+AlfA, agk-H+wXaHxf7aLQT0dZR+AlfA, balbir-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, paolo.valente-rcYM44yAMweonA0d6jMUrA, fernando-gVGce1chcLdL9jVzuh4AOg, jmoyer-H+wXaHxf7aLQT0dZR+AlfA, fchecconi-Re5JQEeQqe8AvxtiuMwx3w, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, righi.andrea-Re5JQEeQqe8AvxtiuMwx3w [-- Attachment #1: Type: text/plain, Size: 2886 bytes --] Vivek Goyal wrote: > On Wed, May 06, 2009 at 04:11:05PM +0800, Gui Jianfeng wrote: >> Vivek Goyal wrote: >>> Hi All, >>> >>> Here is the V2 of the IO controller patches generated on top of 2.6.30-rc4. >>> First version of the patches was posted here. >> Hi Vivek, >> >> I did some simple test for V2, and triggered an kernel panic. >> The following script can reproduce this bug. It seems that the cgroup >> is already removed, but IO Controller still try to access into it. >> > > Hi Gui, > > Thanks for the report. I use cgroup_path() for debugging. I guess that > cgroup_path() was passed null cgrp pointer that's why it crashed. > > If yes, then it is strange though. I call cgroup_path() only after > grabbing a refenrece to css object. (I am assuming that if I have a valid > reference to css object then css->cgrp can't be null). > Yes, css->cgrp shouldn't be NULL.. I doubt we hit a bug in cgroup here. The code dealing with css refcnt and cgroup rmdir has changed quite a lot, and is much more complex than it was. > Anyway, can you please try out following patch and see if it fixes your > crash. ... 
> BTW, I tried following equivalent script and I can't see the crash on
> my system. Are you able to hit it regularly?
>

I modified the script like this:

======================
#!/bin/sh
echo 1 > /proc/sys/vm/drop_caches
mkdir /cgroup 2> /dev/null
mount -t cgroup -o io,blkio io /cgroup
mkdir /cgroup/test1
mkdir /cgroup/test2
echo 100 > /cgroup/test1/io.weight
echo 500 > /cgroup/test2/io.weight

dd if=/dev/zero bs=4096 count=128000 of=500M.1 &
pid1=$!
echo $pid1 > /cgroup/test1/tasks

dd if=/dev/zero bs=4096 count=128000 of=500M.2 &
pid2=$!
echo $pid2 > /cgroup/test2/tasks

sleep 5
kill -9 $pid1
kill -9 $pid2

for ((;count != 2;))
{
rmdir /cgroup/test1 > /dev/null 2>&1
if [ $? -eq 0 ]; then
count=$(( $count + 1 ))
fi

rmdir /cgroup/test2 > /dev/null 2>&1
if [ $? -eq 0 ]; then
count=$(( $count + 1 ))
fi
}

umount /cgroup
rmdir /cgroup
======================

I ran this script and got a lockdep BUG. Full log and my config are attached.

Actually this can be triggered with the following steps on my box:
# mount -t cgroup -o blkio,io xxx /mnt
# mkdir /mnt/0
# echo $$ > /mnt/0/tasks
# echo 3 > /proc/sys/vm/drop_caches
# echo $$ > /mnt/tasks
# rmdir /mnt/0

And when I ran the script for the second time, my box froze
and I had to reset it.

> Instead of killing the tasks I also tried moving the tasks into root cgroup
> and then deleting test1 and test2 groups, that also did not produce any crash.
> (Hit a different bug though after 5-6 attempts :-)
>
> As I mentioned in the patchset, currently we do have issues with group
> refcounting and cgroup/group going away. Hopefully in next version they
> all should be fixed up. But still, it is nice to hear back...
> [-- Attachment #2: myconfig --] [-- Type: text/plain, Size: 64514 bytes --] # # Automatically generated make config: don't edit # Linux kernel version: 2.6.30-rc4 # Thu May 7 09:11:29 2009 # # CONFIG_64BIT is not set CONFIG_X86_32=y # CONFIG_X86_64 is not set CONFIG_X86=y CONFIG_ARCH_DEFCONFIG="arch/x86/configs/i386_defconfig" CONFIG_GENERIC_TIME=y CONFIG_GENERIC_CMOS_UPDATE=y CONFIG_CLOCKSOURCE_WATCHDOG=y CONFIG_GENERIC_CLOCKEVENTS=y CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y CONFIG_LOCKDEP_SUPPORT=y CONFIG_STACKTRACE_SUPPORT=y CONFIG_HAVE_LATENCYTOP_SUPPORT=y CONFIG_FAST_CMPXCHG_LOCAL=y CONFIG_MMU=y CONFIG_ZONE_DMA=y CONFIG_GENERIC_ISA_DMA=y CONFIG_GENERIC_IOMAP=y CONFIG_GENERIC_BUG=y CONFIG_GENERIC_HWEIGHT=y CONFIG_ARCH_MAY_HAVE_PC_FDC=y # CONFIG_RWSEM_GENERIC_SPINLOCK is not set CONFIG_RWSEM_XCHGADD_ALGORITHM=y CONFIG_ARCH_HAS_CPU_IDLE_WAIT=y CONFIG_GENERIC_CALIBRATE_DELAY=y # CONFIG_GENERIC_TIME_VSYSCALL is not set CONFIG_ARCH_HAS_CPU_RELAX=y CONFIG_ARCH_HAS_DEFAULT_IDLE=y CONFIG_ARCH_HAS_CACHE_LINE_SIZE=y CONFIG_HAVE_SETUP_PER_CPU_AREA=y CONFIG_HAVE_DYNAMIC_PER_CPU_AREA=y # CONFIG_HAVE_CPUMASK_OF_CPU_MAP is not set CONFIG_ARCH_HIBERNATION_POSSIBLE=y CONFIG_ARCH_SUSPEND_POSSIBLE=y # CONFIG_ZONE_DMA32 is not set CONFIG_ARCH_POPULATES_NODE_MAP=y # CONFIG_AUDIT_ARCH is not set CONFIG_ARCH_SUPPORTS_OPTIMIZED_INLINING=y CONFIG_ARCH_SUPPORTS_DEBUG_PAGEALLOC=y CONFIG_GENERIC_HARDIRQS=y CONFIG_GENERIC_HARDIRQS_NO__DO_IRQ=y CONFIG_GENERIC_IRQ_PROBE=y CONFIG_GENERIC_PENDING_IRQ=y CONFIG_USE_GENERIC_SMP_HELPERS=y CONFIG_X86_32_SMP=y CONFIG_X86_HT=y CONFIG_X86_TRAMPOLINE=y CONFIG_X86_32_LAZY_GS=y CONFIG_KTIME_SCALAR=y CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config" # # General setup # CONFIG_EXPERIMENTAL=y CONFIG_LOCK_KERNEL=y CONFIG_INIT_ENV_ARG_LIMIT=32 CONFIG_LOCALVERSION="" # CONFIG_LOCALVERSION_AUTO is not set CONFIG_HAVE_KERNEL_GZIP=y CONFIG_HAVE_KERNEL_BZIP2=y CONFIG_HAVE_KERNEL_LZMA=y CONFIG_KERNEL_GZIP=y # CONFIG_KERNEL_BZIP2 is not set # 
CONFIG_KERNEL_LZMA is not set CONFIG_SWAP=y CONFIG_SYSVIPC=y CONFIG_SYSVIPC_SYSCTL=y CONFIG_POSIX_MQUEUE=y CONFIG_POSIX_MQUEUE_SYSCTL=y CONFIG_BSD_PROCESS_ACCT=y # CONFIG_BSD_PROCESS_ACCT_V3 is not set CONFIG_TASKSTATS=y CONFIG_TASK_DELAY_ACCT=y CONFIG_TASK_XACCT=y CONFIG_TASK_IO_ACCOUNTING=y # CONFIG_AUDIT is not set # # RCU Subsystem # # CONFIG_CLASSIC_RCU is not set # CONFIG_TREE_RCU is not set CONFIG_PREEMPT_RCU=y CONFIG_RCU_TRACE=y # CONFIG_TREE_RCU_TRACE is not set CONFIG_PREEMPT_RCU_TRACE=y # CONFIG_IKCONFIG is not set CONFIG_LOG_BUF_SHIFT=17 CONFIG_HAVE_UNSTABLE_SCHED_CLOCK=y CONFIG_GROUP_SCHED=y CONFIG_FAIR_GROUP_SCHED=y CONFIG_RT_GROUP_SCHED=y # CONFIG_USER_SCHED is not set CONFIG_CGROUP_SCHED=y CONFIG_CGROUPS=y CONFIG_CGROUP_DEBUG=y CONFIG_CGROUP_NS=y CONFIG_CGROUP_FREEZER=y CONFIG_CGROUP_DEVICE=y CONFIG_CPUSETS=y CONFIG_PROC_PID_CPUSET=y CONFIG_CGROUP_CPUACCT=y CONFIG_RESOURCE_COUNTERS=y CONFIG_CGROUP_MEM_RES_CTLR=y CONFIG_CGROUP_MEM_RES_CTLR_SWAP=y CONFIG_GROUP_IOSCHED=y CONFIG_CGROUP_BLKIO=y CONFIG_CGROUP_PAGE=y CONFIG_MM_OWNER=y CONFIG_SYSFS_DEPRECATED=y CONFIG_SYSFS_DEPRECATED_V2=y CONFIG_RELAY=y CONFIG_NAMESPACES=y # CONFIG_UTS_NS is not set # CONFIG_IPC_NS is not set CONFIG_USER_NS=y CONFIG_PID_NS=y # CONFIG_NET_NS is not set CONFIG_BLK_DEV_INITRD=y CONFIG_INITRAMFS_SOURCE="" CONFIG_RD_GZIP=y CONFIG_RD_BZIP2=y CONFIG_RD_LZMA=y CONFIG_CC_OPTIMIZE_FOR_SIZE=y CONFIG_SYSCTL=y CONFIG_ANON_INODES=y # CONFIG_EMBEDDED is not set CONFIG_UID16=y CONFIG_SYSCTL_SYSCALL=y CONFIG_KALLSYMS=y CONFIG_KALLSYMS_ALL=y CONFIG_KALLSYMS_EXTRA_PASS=y # CONFIG_STRIP_ASM_SYMS is not set CONFIG_HOTPLUG=y CONFIG_PRINTK=y CONFIG_BUG=y CONFIG_ELF_CORE=y CONFIG_PCSPKR_PLATFORM=y CONFIG_BASE_FULL=y CONFIG_FUTEX=y CONFIG_EPOLL=y CONFIG_SIGNALFD=y CONFIG_TIMERFD=y CONFIG_EVENTFD=y CONFIG_SHMEM=y CONFIG_AIO=y CONFIG_VM_EVENT_COUNTERS=y CONFIG_PCI_QUIRKS=y CONFIG_SLUB_DEBUG=y CONFIG_COMPAT_BRK=y # CONFIG_SLAB is not set CONFIG_SLUB=y # CONFIG_SLOB is not set CONFIG_PROFILING=y 
CONFIG_TRACEPOINTS=y
CONFIG_MARKERS=y
CONFIG_OPROFILE=m
# CONFIG_OPROFILE_IBS is not set
CONFIG_HAVE_OPROFILE=y
CONFIG_KPROBES=y
CONFIG_HAVE_EFFICIENT_UNALIGNED_ACCESS=y
CONFIG_KRETPROBES=y
CONFIG_HAVE_IOREMAP_PROT=y
CONFIG_HAVE_KPROBES=y
CONFIG_HAVE_KRETPROBES=y
CONFIG_HAVE_ARCH_TRACEHOOK=y
CONFIG_HAVE_DMA_API_DEBUG=y
# CONFIG_SLOW_WORK is not set
CONFIG_HAVE_GENERIC_DMA_COHERENT=y
CONFIG_SLABINFO=y
CONFIG_RT_MUTEXES=y
CONFIG_BASE_SMALL=0
CONFIG_MODULES=y
# CONFIG_MODULE_FORCE_LOAD is not set
CONFIG_MODULE_UNLOAD=y
# CONFIG_MODULE_FORCE_UNLOAD is not set
# CONFIG_MODVERSIONS is not set
# CONFIG_MODULE_SRCVERSION_ALL is not set
CONFIG_STOP_MACHINE=y
CONFIG_BLOCK=y
CONFIG_LBD=y
CONFIG_BLK_DEV_BSG=y
# CONFIG_BLK_DEV_INTEGRITY is not set

#
# IO Schedulers
#
CONFIG_ELV_FAIR_QUEUING=y
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_NOOP_HIER=y
CONFIG_IOSCHED_AS=m
CONFIG_IOSCHED_AS_HIER=y
CONFIG_IOSCHED_DEADLINE=m
CONFIG_IOSCHED_DEADLINE_HIER=y
CONFIG_IOSCHED_CFQ=y
CONFIG_IOSCHED_CFQ_HIER=y
# CONFIG_DEFAULT_AS is not set
# CONFIG_DEFAULT_DEADLINE is not set
CONFIG_DEFAULT_CFQ=y
# CONFIG_DEFAULT_NOOP is not set
CONFIG_DEFAULT_IOSCHED="cfq"
CONFIG_TRACK_ASYNC_CONTEXT=y
CONFIG_DEBUG_GROUP_IOSCHED=y
CONFIG_FREEZER=y

#
# Processor type and features
#
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ=y
CONFIG_HIGH_RES_TIMERS=y
CONFIG_GENERIC_CLOCKEVENTS_BUILD=y
CONFIG_SMP=y
# CONFIG_SPARSE_IRQ is not set
CONFIG_X86_MPPARSE=y
# CONFIG_X86_BIGSMP is not set
CONFIG_X86_EXTENDED_PLATFORM=y
# CONFIG_X86_ELAN is not set
# CONFIG_X86_RDC321X is not set
# CONFIG_X86_32_NON_STANDARD is not set
CONFIG_SCHED_OMIT_FRAME_POINTER=y
# CONFIG_PARAVIRT_GUEST is not set
# CONFIG_MEMTEST is not set
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
CONFIG_M686=y
# CONFIG_MPENTIUMII is not set
# CONFIG_MPENTIUMIII is not set
# CONFIG_MPENTIUMM is not set
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
# CONFIG_MK7 is not set
# CONFIG_MK8
is not set # CONFIG_MCRUSOE is not set # CONFIG_MEFFICEON is not set # CONFIG_MWINCHIPC6 is not set # CONFIG_MWINCHIP3D is not set # CONFIG_MGEODEGX1 is not set # CONFIG_MGEODE_LX is not set # CONFIG_MCYRIXIII is not set # CONFIG_MVIAC3_2 is not set # CONFIG_MVIAC7 is not set # CONFIG_MPSC is not set # CONFIG_MCORE2 is not set # CONFIG_GENERIC_CPU is not set CONFIG_X86_GENERIC=y CONFIG_X86_CPU=y CONFIG_X86_L1_CACHE_BYTES=64 CONFIG_X86_INTERNODE_CACHE_BYTES=64 CONFIG_X86_CMPXCHG=y CONFIG_X86_L1_CACHE_SHIFT=5 CONFIG_X86_XADD=y CONFIG_X86_PPRO_FENCE=y CONFIG_X86_WP_WORKS_OK=y CONFIG_X86_INVLPG=y CONFIG_X86_BSWAP=y CONFIG_X86_POPAD_OK=y CONFIG_X86_INTEL_USERCOPY=y CONFIG_X86_USE_PPRO_CHECKSUM=y CONFIG_X86_TSC=y CONFIG_X86_CMOV=y CONFIG_X86_MINIMUM_CPU_FAMILY=4 CONFIG_X86_DEBUGCTLMSR=y CONFIG_CPU_SUP_INTEL=y CONFIG_CPU_SUP_CYRIX_32=y CONFIG_CPU_SUP_AMD=y CONFIG_CPU_SUP_CENTAUR=y CONFIG_CPU_SUP_TRANSMETA_32=y CONFIG_CPU_SUP_UMC_32=y # CONFIG_X86_DS is not set CONFIG_HPET_TIMER=y CONFIG_HPET_EMULATE_RTC=y CONFIG_DMI=y # CONFIG_IOMMU_HELPER is not set # CONFIG_IOMMU_API is not set CONFIG_NR_CPUS=8 # CONFIG_SCHED_SMT is not set CONFIG_SCHED_MC=y # CONFIG_PREEMPT_NONE is not set # CONFIG_PREEMPT_VOLUNTARY is not set CONFIG_PREEMPT=y CONFIG_X86_LOCAL_APIC=y CONFIG_X86_IO_APIC=y # CONFIG_X86_REROUTE_FOR_BROKEN_BOOT_IRQS is not set CONFIG_X86_MCE=y # CONFIG_X86_MCE_NONFATAL is not set # CONFIG_X86_MCE_P4THERMAL is not set CONFIG_VM86=y # CONFIG_TOSHIBA is not set # CONFIG_I8K is not set # CONFIG_X86_REBOOTFIXUPS is not set # CONFIG_MICROCODE is not set CONFIG_X86_MSR=m CONFIG_X86_CPUID=m # CONFIG_X86_CPU_DEBUG is not set # CONFIG_NOHIGHMEM is not set CONFIG_HIGHMEM4G=y # CONFIG_HIGHMEM64G is not set CONFIG_PAGE_OFFSET=0xC0000000 CONFIG_HIGHMEM=y # CONFIG_ARCH_PHYS_ADDR_T_64BIT is not set CONFIG_ARCH_FLATMEM_ENABLE=y CONFIG_ARCH_SPARSEMEM_ENABLE=y CONFIG_ARCH_SELECT_MEMORY_MODEL=y CONFIG_SELECT_MEMORY_MODEL=y CONFIG_FLATMEM_MANUAL=y # CONFIG_DISCONTIGMEM_MANUAL is not set # 
CONFIG_SPARSEMEM_MANUAL is not set CONFIG_FLATMEM=y CONFIG_FLAT_NODE_MEM_MAP=y CONFIG_SPARSEMEM_STATIC=y CONFIG_PAGEFLAGS_EXTENDED=y CONFIG_SPLIT_PTLOCK_CPUS=4 # CONFIG_PHYS_ADDR_T_64BIT is not set CONFIG_ZONE_DMA_FLAG=1 CONFIG_BOUNCE=y CONFIG_VIRT_TO_BUS=y CONFIG_UNEVICTABLE_LRU=y CONFIG_HAVE_MLOCK=y CONFIG_HAVE_MLOCKED_PAGE_BIT=y CONFIG_HIGHPTE=y # CONFIG_X86_CHECK_BIOS_CORRUPTION is not set CONFIG_X86_RESERVE_LOW_64K=y # CONFIG_MATH_EMULATION is not set CONFIG_MTRR=y CONFIG_MTRR_SANITIZER=y CONFIG_MTRR_SANITIZER_ENABLE_DEFAULT=0 CONFIG_MTRR_SANITIZER_SPARE_REG_NR_DEFAULT=1 # CONFIG_X86_PAT is not set CONFIG_EFI=y CONFIG_SECCOMP=y # CONFIG_CC_STACKPROTECTOR is not set # CONFIG_HZ_100 is not set # CONFIG_HZ_250 is not set # CONFIG_HZ_300 is not set CONFIG_HZ_1000=y CONFIG_HZ=1000 CONFIG_SCHED_HRTICK=y CONFIG_KEXEC=y CONFIG_CRASH_DUMP=y CONFIG_PHYSICAL_START=0x1000000 CONFIG_RELOCATABLE=y CONFIG_PHYSICAL_ALIGN=0x400000 CONFIG_HOTPLUG_CPU=y # CONFIG_COMPAT_VDSO is not set # CONFIG_CMDLINE_BOOL is not set CONFIG_ARCH_ENABLE_MEMORY_HOTPLUG=y # # Power management and ACPI options # CONFIG_PM=y CONFIG_PM_DEBUG=y # CONFIG_PM_VERBOSE is not set CONFIG_CAN_PM_TRACE=y # CONFIG_PM_TRACE_RTC is not set CONFIG_PM_SLEEP_SMP=y CONFIG_PM_SLEEP=y CONFIG_SUSPEND=y CONFIG_SUSPEND_FREEZER=y # CONFIG_HIBERNATION is not set CONFIG_ACPI=y CONFIG_ACPI_SLEEP=y # CONFIG_ACPI_PROCFS is not set # CONFIG_ACPI_PROCFS_POWER is not set CONFIG_ACPI_SYSFS_POWER=y # CONFIG_ACPI_PROC_EVENT is not set CONFIG_ACPI_AC=m # CONFIG_ACPI_BATTERY is not set CONFIG_ACPI_BUTTON=m CONFIG_ACPI_VIDEO=m CONFIG_ACPI_FAN=y CONFIG_ACPI_DOCK=y CONFIG_ACPI_PROCESSOR=y CONFIG_ACPI_HOTPLUG_CPU=y CONFIG_ACPI_THERMAL=y # CONFIG_ACPI_CUSTOM_DSDT is not set CONFIG_ACPI_BLACKLIST_YEAR=1999 # CONFIG_ACPI_DEBUG is not set # CONFIG_ACPI_PCI_SLOT is not set CONFIG_X86_PM_TIMER=y CONFIG_ACPI_CONTAINER=y # CONFIG_ACPI_SBS is not set CONFIG_X86_APM_BOOT=y CONFIG_APM=y # CONFIG_APM_IGNORE_USER_SUSPEND is not set # 
CONFIG_APM_DO_ENABLE is not set CONFIG_APM_CPU_IDLE=y # CONFIG_APM_DISPLAY_BLANK is not set # CONFIG_APM_ALLOW_INTS is not set # # CPU Frequency scaling # CONFIG_CPU_FREQ=y CONFIG_CPU_FREQ_TABLE=y CONFIG_CPU_FREQ_DEBUG=y CONFIG_CPU_FREQ_STAT=m CONFIG_CPU_FREQ_STAT_DETAILS=y # CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE is not set # CONFIG_CPU_FREQ_DEFAULT_GOV_POWERSAVE is not set CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE=y # CONFIG_CPU_FREQ_DEFAULT_GOV_ONDEMAND is not set # CONFIG_CPU_FREQ_DEFAULT_GOV_CONSERVATIVE is not set CONFIG_CPU_FREQ_GOV_PERFORMANCE=y CONFIG_CPU_FREQ_GOV_POWERSAVE=m CONFIG_CPU_FREQ_GOV_USERSPACE=y CONFIG_CPU_FREQ_GOV_ONDEMAND=m CONFIG_CPU_FREQ_GOV_CONSERVATIVE=m # # CPUFreq processor drivers # # CONFIG_X86_ACPI_CPUFREQ is not set # CONFIG_X86_POWERNOW_K6 is not set # CONFIG_X86_POWERNOW_K7 is not set # CONFIG_X86_POWERNOW_K8 is not set # CONFIG_X86_GX_SUSPMOD is not set # CONFIG_X86_SPEEDSTEP_CENTRINO is not set CONFIG_X86_SPEEDSTEP_ICH=y CONFIG_X86_SPEEDSTEP_SMI=y # CONFIG_X86_P4_CLOCKMOD is not set # CONFIG_X86_CPUFREQ_NFORCE2 is not set # CONFIG_X86_LONGRUN is not set # CONFIG_X86_LONGHAUL is not set # CONFIG_X86_E_POWERSAVER is not set # # shared options # CONFIG_X86_SPEEDSTEP_LIB=y # CONFIG_X86_SPEEDSTEP_RELAXED_CAP_CHECK is not set CONFIG_CPU_IDLE=y CONFIG_CPU_IDLE_GOV_LADDER=y CONFIG_CPU_IDLE_GOV_MENU=y # # Bus options (PCI etc.) 
# CONFIG_PCI=y # CONFIG_PCI_GOBIOS is not set # CONFIG_PCI_GOMMCONFIG is not set # CONFIG_PCI_GODIRECT is not set # CONFIG_PCI_GOOLPC is not set CONFIG_PCI_GOANY=y CONFIG_PCI_BIOS=y CONFIG_PCI_DIRECT=y CONFIG_PCI_MMCONFIG=y CONFIG_PCI_DOMAINS=y CONFIG_PCIEPORTBUS=y CONFIG_HOTPLUG_PCI_PCIE=m CONFIG_PCIEAER=y # CONFIG_PCIEASPM is not set CONFIG_ARCH_SUPPORTS_MSI=y # CONFIG_PCI_MSI is not set CONFIG_PCI_LEGACY=y # CONFIG_PCI_DEBUG is not set # CONFIG_PCI_STUB is not set CONFIG_HT_IRQ=y # CONFIG_PCI_IOV is not set CONFIG_ISA_DMA_API=y CONFIG_ISA=y # CONFIG_EISA is not set # CONFIG_MCA is not set # CONFIG_SCx200 is not set # CONFIG_OLPC is not set CONFIG_PCCARD=y # CONFIG_PCMCIA_DEBUG is not set CONFIG_PCMCIA=y CONFIG_PCMCIA_LOAD_CIS=y # CONFIG_PCMCIA_IOCTL is not set CONFIG_CARDBUS=y # # PC-card bridges # CONFIG_YENTA=y CONFIG_YENTA_O2=y CONFIG_YENTA_RICOH=y CONFIG_YENTA_TI=y CONFIG_YENTA_ENE_TUNE=y CONFIG_YENTA_TOSHIBA=y # CONFIG_PD6729 is not set # CONFIG_I82092 is not set # CONFIG_I82365 is not set # CONFIG_TCIC is not set CONFIG_PCMCIA_PROBE=y CONFIG_PCCARD_NONSTATIC=y CONFIG_HOTPLUG_PCI=y CONFIG_HOTPLUG_PCI_FAKE=m # CONFIG_HOTPLUG_PCI_COMPAQ is not set # CONFIG_HOTPLUG_PCI_IBM is not set CONFIG_HOTPLUG_PCI_ACPI=m CONFIG_HOTPLUG_PCI_ACPI_IBM=m # CONFIG_HOTPLUG_PCI_CPCI is not set # CONFIG_HOTPLUG_PCI_SHPC is not set # # Executable file formats / Emulations # CONFIG_BINFMT_ELF=y # CONFIG_CORE_DUMP_DEFAULT_ELF_HEADERS is not set CONFIG_HAVE_AOUT=y # CONFIG_BINFMT_AOUT is not set CONFIG_BINFMT_MISC=y CONFIG_HAVE_ATOMIC_IOMAP=y CONFIG_NET=y # # Networking options # CONFIG_PACKET=y CONFIG_PACKET_MMAP=y CONFIG_UNIX=y # CONFIG_NET_KEY is not set CONFIG_INET=y CONFIG_IP_MULTICAST=y CONFIG_IP_ADVANCED_ROUTER=y CONFIG_ASK_IP_FIB_HASH=y # CONFIG_IP_FIB_TRIE is not set CONFIG_IP_FIB_HASH=y CONFIG_IP_MULTIPLE_TABLES=y CONFIG_IP_ROUTE_MULTIPATH=y CONFIG_IP_ROUTE_VERBOSE=y # CONFIG_IP_PNP is not set CONFIG_NET_IPIP=m # CONFIG_NET_IPGRE is not set CONFIG_IP_MROUTE=y 
CONFIG_IP_PIMSM_V1=y CONFIG_IP_PIMSM_V2=y # CONFIG_ARPD is not set CONFIG_SYN_COOKIES=y # CONFIG_INET_AH is not set # CONFIG_INET_ESP is not set # CONFIG_INET_IPCOMP is not set # CONFIG_INET_XFRM_TUNNEL is not set CONFIG_INET_TUNNEL=m # CONFIG_INET_XFRM_MODE_TRANSPORT is not set # CONFIG_INET_XFRM_MODE_TUNNEL is not set # CONFIG_INET_XFRM_MODE_BEET is not set CONFIG_INET_LRO=m CONFIG_INET_DIAG=m CONFIG_INET_TCP_DIAG=m CONFIG_TCP_CONG_ADVANCED=y CONFIG_TCP_CONG_BIC=m CONFIG_TCP_CONG_CUBIC=y # CONFIG_TCP_CONG_WESTWOOD is not set # CONFIG_TCP_CONG_HTCP is not set CONFIG_TCP_CONG_HSTCP=m CONFIG_TCP_CONG_HYBLA=m # CONFIG_TCP_CONG_VEGAS is not set CONFIG_TCP_CONG_SCALABLE=m CONFIG_TCP_CONG_LP=m # CONFIG_TCP_CONG_VENO is not set # CONFIG_TCP_CONG_YEAH is not set CONFIG_TCP_CONG_ILLINOIS=m # CONFIG_DEFAULT_BIC is not set CONFIG_DEFAULT_CUBIC=y # CONFIG_DEFAULT_HTCP is not set # CONFIG_DEFAULT_VEGAS is not set # CONFIG_DEFAULT_WESTWOOD is not set # CONFIG_DEFAULT_RENO is not set CONFIG_DEFAULT_TCP_CONG="cubic" # CONFIG_TCP_MD5SIG is not set # CONFIG_IPV6 is not set # CONFIG_NETWORK_SECMARK is not set # CONFIG_NETFILTER is not set # CONFIG_IP_DCCP is not set # CONFIG_IP_SCTP is not set # CONFIG_TIPC is not set # CONFIG_ATM is not set CONFIG_STP=m CONFIG_BRIDGE=m # CONFIG_NET_DSA is not set # CONFIG_VLAN_8021Q is not set # CONFIG_DECNET is not set CONFIG_LLC=m # CONFIG_LLC2 is not set # CONFIG_IPX is not set # CONFIG_ATALK is not set # CONFIG_X25 is not set # CONFIG_LAPB is not set # CONFIG_ECONET is not set # CONFIG_WAN_ROUTER is not set # CONFIG_PHONET is not set CONFIG_NET_SCHED=y # # Queueing/Scheduling # # CONFIG_NET_SCH_CBQ is not set # CONFIG_NET_SCH_HTB is not set # CONFIG_NET_SCH_HFSC is not set # CONFIG_NET_SCH_PRIO is not set # CONFIG_NET_SCH_MULTIQ is not set # CONFIG_NET_SCH_RED is not set # CONFIG_NET_SCH_SFQ is not set # CONFIG_NET_SCH_TEQL is not set # CONFIG_NET_SCH_TBF is not set # CONFIG_NET_SCH_GRED is not set # CONFIG_NET_SCH_DSMARK is not set # 
CONFIG_NET_SCH_NETEM is not set # CONFIG_NET_SCH_DRR is not set # # Classification # CONFIG_NET_CLS=y # CONFIG_NET_CLS_BASIC is not set # CONFIG_NET_CLS_TCINDEX is not set # CONFIG_NET_CLS_ROUTE4 is not set # CONFIG_NET_CLS_FW is not set # CONFIG_NET_CLS_U32 is not set # CONFIG_NET_CLS_RSVP is not set # CONFIG_NET_CLS_RSVP6 is not set # CONFIG_NET_CLS_FLOW is not set CONFIG_NET_CLS_CGROUP=y # CONFIG_NET_EMATCH is not set # CONFIG_NET_CLS_ACT is not set CONFIG_NET_SCH_FIFO=y # CONFIG_DCB is not set # # Network testing # # CONFIG_NET_PKTGEN is not set # CONFIG_NET_TCPPROBE is not set # CONFIG_NET_DROP_MONITOR is not set # CONFIG_HAMRADIO is not set # CONFIG_CAN is not set # CONFIG_IRDA is not set # CONFIG_BT is not set # CONFIG_AF_RXRPC is not set CONFIG_FIB_RULES=y # CONFIG_WIRELESS is not set # CONFIG_WIMAX is not set # CONFIG_RFKILL is not set # CONFIG_NET_9P is not set # # Device Drivers # # # Generic Driver Options # CONFIG_UEVENT_HELPER_PATH="/sbin/hotplug" CONFIG_STANDALONE=y CONFIG_PREVENT_FIRMWARE_BUILD=y CONFIG_FW_LOADER=y CONFIG_FIRMWARE_IN_KERNEL=y CONFIG_EXTRA_FIRMWARE="" # CONFIG_DEBUG_DRIVER is not set CONFIG_DEBUG_DEVRES=y # CONFIG_SYS_HYPERVISOR is not set # CONFIG_CONNECTOR is not set # CONFIG_MTD is not set CONFIG_PARPORT=m CONFIG_PARPORT_PC=m CONFIG_PARPORT_SERIAL=m # CONFIG_PARPORT_PC_FIFO is not set # CONFIG_PARPORT_PC_SUPERIO is not set CONFIG_PARPORT_PC_PCMCIA=m # CONFIG_PARPORT_GSC is not set # CONFIG_PARPORT_AX88796 is not set CONFIG_PARPORT_1284=y CONFIG_PNP=y CONFIG_PNP_DEBUG_MESSAGES=y # # Protocols # CONFIG_ISAPNP=y # CONFIG_PNPBIOS is not set CONFIG_PNPACPI=y CONFIG_BLK_DEV=y # CONFIG_BLK_DEV_FD is not set # CONFIG_BLK_DEV_XD is not set CONFIG_PARIDE=m # # Parallel IDE high-level drivers # CONFIG_PARIDE_PD=m CONFIG_PARIDE_PCD=m CONFIG_PARIDE_PF=m # CONFIG_PARIDE_PT is not set CONFIG_PARIDE_PG=m # # Parallel IDE protocol modules # # CONFIG_PARIDE_ATEN is not set # CONFIG_PARIDE_BPCK is not set # CONFIG_PARIDE_BPCK6 is not set # 
CONFIG_PARIDE_COMM is not set # CONFIG_PARIDE_DSTR is not set # CONFIG_PARIDE_FIT2 is not set # CONFIG_PARIDE_FIT3 is not set # CONFIG_PARIDE_EPAT is not set # CONFIG_PARIDE_EPIA is not set # CONFIG_PARIDE_FRIQ is not set # CONFIG_PARIDE_FRPW is not set # CONFIG_PARIDE_KBIC is not set # CONFIG_PARIDE_KTTI is not set # CONFIG_PARIDE_ON20 is not set # CONFIG_PARIDE_ON26 is not set # CONFIG_BLK_CPQ_DA is not set # CONFIG_BLK_CPQ_CISS_DA is not set # CONFIG_BLK_DEV_DAC960 is not set # CONFIG_BLK_DEV_UMEM is not set # CONFIG_BLK_DEV_COW_COMMON is not set CONFIG_BLK_DEV_LOOP=m CONFIG_BLK_DEV_CRYPTOLOOP=m CONFIG_BLK_DEV_NBD=m # CONFIG_BLK_DEV_SX8 is not set # CONFIG_BLK_DEV_UB is not set CONFIG_BLK_DEV_RAM=y CONFIG_BLK_DEV_RAM_COUNT=16 CONFIG_BLK_DEV_RAM_SIZE=16384 # CONFIG_BLK_DEV_XIP is not set # CONFIG_CDROM_PKTCDVD is not set # CONFIG_ATA_OVER_ETH is not set # CONFIG_BLK_DEV_HD is not set CONFIG_MISC_DEVICES=y # CONFIG_IBM_ASM is not set # CONFIG_PHANTOM is not set # CONFIG_SGI_IOC4 is not set # CONFIG_TIFM_CORE is not set # CONFIG_ICS932S401 is not set # CONFIG_ENCLOSURE_SERVICES is not set # CONFIG_HP_ILO is not set # CONFIG_ISL29003 is not set # CONFIG_C2PORT is not set # # EEPROM support # # CONFIG_EEPROM_AT24 is not set # CONFIG_EEPROM_LEGACY is not set CONFIG_EEPROM_93CX6=m CONFIG_HAVE_IDE=y # CONFIG_IDE is not set # # SCSI device support # # CONFIG_RAID_ATTRS is not set CONFIG_SCSI=m CONFIG_SCSI_DMA=y CONFIG_SCSI_TGT=m CONFIG_SCSI_NETLINK=y CONFIG_SCSI_PROC_FS=y # # SCSI support type (disk, tape, CD-ROM) # CONFIG_BLK_DEV_SD=m # CONFIG_CHR_DEV_ST is not set # CONFIG_CHR_DEV_OSST is not set CONFIG_BLK_DEV_SR=m CONFIG_BLK_DEV_SR_VENDOR=y CONFIG_CHR_DEV_SG=m CONFIG_CHR_DEV_SCH=m # # Some SCSI devices (e.g. 
CD jukebox) support multiple LUNs # CONFIG_SCSI_MULTI_LUN=y # CONFIG_SCSI_CONSTANTS is not set CONFIG_SCSI_LOGGING=y CONFIG_SCSI_SCAN_ASYNC=y CONFIG_SCSI_WAIT_SCAN=m # # SCSI Transports # CONFIG_SCSI_SPI_ATTRS=m CONFIG_SCSI_FC_ATTRS=m # CONFIG_SCSI_FC_TGT_ATTRS is not set CONFIG_SCSI_ISCSI_ATTRS=m CONFIG_SCSI_SAS_ATTRS=m CONFIG_SCSI_SAS_LIBSAS=m CONFIG_SCSI_SAS_ATA=y CONFIG_SCSI_SAS_HOST_SMP=y # CONFIG_SCSI_SAS_LIBSAS_DEBUG is not set CONFIG_SCSI_SRP_ATTRS=m # CONFIG_SCSI_SRP_TGT_ATTRS is not set CONFIG_SCSI_LOWLEVEL=y CONFIG_ISCSI_TCP=m # CONFIG_BLK_DEV_3W_XXXX_RAID is not set # CONFIG_SCSI_3W_9XXX is not set # CONFIG_SCSI_7000FASST is not set CONFIG_SCSI_ACARD=m # CONFIG_SCSI_AHA152X is not set # CONFIG_SCSI_AHA1542 is not set # CONFIG_SCSI_AACRAID is not set CONFIG_SCSI_AIC7XXX=m CONFIG_AIC7XXX_CMDS_PER_DEVICE=4 CONFIG_AIC7XXX_RESET_DELAY_MS=15000 # CONFIG_AIC7XXX_DEBUG_ENABLE is not set CONFIG_AIC7XXX_DEBUG_MASK=0 # CONFIG_AIC7XXX_REG_PRETTY_PRINT is not set CONFIG_SCSI_AIC7XXX_OLD=m CONFIG_SCSI_AIC79XX=m CONFIG_AIC79XX_CMDS_PER_DEVICE=4 CONFIG_AIC79XX_RESET_DELAY_MS=15000 # CONFIG_AIC79XX_DEBUG_ENABLE is not set CONFIG_AIC79XX_DEBUG_MASK=0 # CONFIG_AIC79XX_REG_PRETTY_PRINT is not set CONFIG_SCSI_AIC94XX=m # CONFIG_AIC94XX_DEBUG is not set # CONFIG_SCSI_DPT_I2O is not set CONFIG_SCSI_ADVANSYS=m # CONFIG_SCSI_IN2000 is not set # CONFIG_SCSI_ARCMSR is not set # CONFIG_MEGARAID_NEWGEN is not set # CONFIG_MEGARAID_LEGACY is not set # CONFIG_MEGARAID_SAS is not set # CONFIG_SCSI_MPT2SAS is not set # CONFIG_SCSI_HPTIOP is not set CONFIG_SCSI_BUSLOGIC=m # CONFIG_SCSI_FLASHPOINT is not set # CONFIG_LIBFC is not set # CONFIG_LIBFCOE is not set # CONFIG_FCOE is not set # CONFIG_SCSI_DMX3191D is not set # CONFIG_SCSI_DTC3280 is not set # CONFIG_SCSI_EATA is not set # CONFIG_SCSI_FUTURE_DOMAIN is not set CONFIG_SCSI_GDTH=m # CONFIG_SCSI_GENERIC_NCR5380 is not set # CONFIG_SCSI_GENERIC_NCR5380_MMIO is not set CONFIG_SCSI_IPS=m CONFIG_SCSI_INITIO=m CONFIG_SCSI_INIA100=m 
CONFIG_SCSI_PPA=m CONFIG_SCSI_IMM=m # CONFIG_SCSI_IZIP_EPP16 is not set # CONFIG_SCSI_IZIP_SLOW_CTR is not set # CONFIG_SCSI_MVSAS is not set # CONFIG_SCSI_NCR53C406A is not set # CONFIG_SCSI_STEX is not set CONFIG_SCSI_SYM53C8XX_2=m CONFIG_SCSI_SYM53C8XX_DMA_ADDRESSING_MODE=1 CONFIG_SCSI_SYM53C8XX_DEFAULT_TAGS=16 CONFIG_SCSI_SYM53C8XX_MAX_TAGS=64 CONFIG_SCSI_SYM53C8XX_MMIO=y # CONFIG_SCSI_IPR is not set # CONFIG_SCSI_PAS16 is not set # CONFIG_SCSI_QLOGIC_FAS is not set # CONFIG_SCSI_QLOGIC_1280 is not set # CONFIG_SCSI_QLA_FC is not set # CONFIG_SCSI_QLA_ISCSI is not set # CONFIG_SCSI_LPFC is not set # CONFIG_SCSI_SYM53C416 is not set # CONFIG_SCSI_DC395x is not set # CONFIG_SCSI_DC390T is not set # CONFIG_SCSI_T128 is not set # CONFIG_SCSI_U14_34F is not set # CONFIG_SCSI_ULTRASTOR is not set # CONFIG_SCSI_NSP32 is not set # CONFIG_SCSI_DEBUG is not set # CONFIG_SCSI_SRP is not set CONFIG_SCSI_LOWLEVEL_PCMCIA=y # CONFIG_PCMCIA_AHA152X is not set # CONFIG_PCMCIA_FDOMAIN is not set # CONFIG_PCMCIA_NINJA_SCSI is not set CONFIG_PCMCIA_QLOGIC=m # CONFIG_PCMCIA_SYM53C500 is not set # CONFIG_SCSI_DH is not set # CONFIG_SCSI_OSD_INITIATOR is not set CONFIG_ATA=m # CONFIG_ATA_NONSTANDARD is not set CONFIG_ATA_ACPI=y CONFIG_SATA_PMP=y CONFIG_SATA_AHCI=m # CONFIG_SATA_SIL24 is not set CONFIG_ATA_SFF=y # CONFIG_SATA_SVW is not set CONFIG_ATA_PIIX=m # CONFIG_SATA_MV is not set CONFIG_SATA_NV=m # CONFIG_PDC_ADMA is not set # CONFIG_SATA_QSTOR is not set # CONFIG_SATA_PROMISE is not set # CONFIG_SATA_SX4 is not set # CONFIG_SATA_SIL is not set CONFIG_SATA_SIS=m # CONFIG_SATA_ULI is not set # CONFIG_SATA_VIA is not set # CONFIG_SATA_VITESSE is not set # CONFIG_SATA_INIC162X is not set # CONFIG_PATA_ACPI is not set # CONFIG_PATA_ALI is not set # CONFIG_PATA_AMD is not set # CONFIG_PATA_ARTOP is not set CONFIG_PATA_ATIIXP=m # CONFIG_PATA_CMD640_PCI is not set # CONFIG_PATA_CMD64X is not set # CONFIG_PATA_CS5520 is not set # CONFIG_PATA_CS5530 is not set # CONFIG_PATA_CS5535 is not 
set # CONFIG_PATA_CS5536 is not set # CONFIG_PATA_CYPRESS is not set # CONFIG_PATA_EFAR is not set CONFIG_ATA_GENERIC=m # CONFIG_PATA_HPT366 is not set # CONFIG_PATA_HPT37X is not set # CONFIG_PATA_HPT3X2N is not set # CONFIG_PATA_HPT3X3 is not set # CONFIG_PATA_ISAPNP is not set # CONFIG_PATA_IT821X is not set # CONFIG_PATA_IT8213 is not set # CONFIG_PATA_JMICRON is not set # CONFIG_PATA_LEGACY is not set # CONFIG_PATA_TRIFLEX is not set # CONFIG_PATA_MARVELL is not set CONFIG_PATA_MPIIX=m # CONFIG_PATA_OLDPIIX is not set # CONFIG_PATA_NETCELL is not set # CONFIG_PATA_NINJA32 is not set # CONFIG_PATA_NS87410 is not set # CONFIG_PATA_NS87415 is not set # CONFIG_PATA_OPTI is not set # CONFIG_PATA_OPTIDMA is not set CONFIG_PATA_PCMCIA=m # CONFIG_PATA_PDC_OLD is not set # CONFIG_PATA_QDI is not set # CONFIG_PATA_RADISYS is not set # CONFIG_PATA_RZ1000 is not set # CONFIG_PATA_SC1200 is not set # CONFIG_PATA_SERVERWORKS is not set # CONFIG_PATA_PDC2027X is not set # CONFIG_PATA_SIL680 is not set CONFIG_PATA_SIS=m CONFIG_PATA_VIA=m # CONFIG_PATA_WINBOND is not set # CONFIG_PATA_WINBOND_VLB is not set # CONFIG_PATA_SCH is not set # CONFIG_MD is not set CONFIG_FUSION=y CONFIG_FUSION_SPI=m CONFIG_FUSION_FC=m # CONFIG_FUSION_SAS is not set CONFIG_FUSION_MAX_SGE=40 CONFIG_FUSION_CTL=m CONFIG_FUSION_LAN=m CONFIG_FUSION_LOGGING=y # # IEEE 1394 (FireWire) support # # # Enable only one of the two stacks, unless you know what you are doing # CONFIG_FIREWIRE=m CONFIG_FIREWIRE_OHCI=m CONFIG_FIREWIRE_OHCI_DEBUG=y CONFIG_FIREWIRE_SBP2=m # CONFIG_IEEE1394 is not set CONFIG_I2O=m # CONFIG_I2O_LCT_NOTIFY_ON_CHANGES is not set CONFIG_I2O_EXT_ADAPTEC=y CONFIG_I2O_CONFIG=m CONFIG_I2O_CONFIG_OLD_IOCTL=y CONFIG_I2O_BUS=m CONFIG_I2O_BLOCK=m CONFIG_I2O_SCSI=m CONFIG_I2O_PROC=m # CONFIG_MACINTOSH_DRIVERS is not set CONFIG_NETDEVICES=y CONFIG_COMPAT_NET_DEV_OPS=y CONFIG_DUMMY=m CONFIG_BONDING=m # CONFIG_MACVLAN is not set # CONFIG_EQUALIZER is not set CONFIG_TUN=m # CONFIG_VETH is not set # 
CONFIG_NET_SB1000 is not set # CONFIG_ARCNET is not set CONFIG_PHYLIB=m # # MII PHY device drivers # # CONFIG_MARVELL_PHY is not set # CONFIG_DAVICOM_PHY is not set # CONFIG_QSEMI_PHY is not set CONFIG_LXT_PHY=m # CONFIG_CICADA_PHY is not set # CONFIG_VITESSE_PHY is not set # CONFIG_SMSC_PHY is not set # CONFIG_BROADCOM_PHY is not set # CONFIG_ICPLUS_PHY is not set # CONFIG_REALTEK_PHY is not set # CONFIG_NATIONAL_PHY is not set # CONFIG_STE10XP is not set # CONFIG_LSI_ET1011C_PHY is not set # CONFIG_MDIO_BITBANG is not set CONFIG_NET_ETHERNET=y CONFIG_MII=m # CONFIG_HAPPYMEAL is not set # CONFIG_SUNGEM is not set # CONFIG_CASSINI is not set CONFIG_NET_VENDOR_3COM=y # CONFIG_EL1 is not set # CONFIG_EL2 is not set # CONFIG_ELPLUS is not set # CONFIG_EL16 is not set CONFIG_EL3=m # CONFIG_3C515 is not set CONFIG_VORTEX=m CONFIG_TYPHOON=m # CONFIG_LANCE is not set CONFIG_NET_VENDOR_SMC=y # CONFIG_WD80x3 is not set # CONFIG_ULTRA is not set # CONFIG_SMC9194 is not set # CONFIG_ETHOC is not set # CONFIG_NET_VENDOR_RACAL is not set # CONFIG_DNET is not set CONFIG_NET_TULIP=y CONFIG_DE2104X=m CONFIG_TULIP=m # CONFIG_TULIP_MWI is not set CONFIG_TULIP_MMIO=y # CONFIG_TULIP_NAPI is not set CONFIG_DE4X5=m CONFIG_WINBOND_840=m CONFIG_DM9102=m CONFIG_ULI526X=m CONFIG_PCMCIA_XIRCOM=m # CONFIG_AT1700 is not set # CONFIG_DEPCA is not set # CONFIG_HP100 is not set CONFIG_NET_ISA=y # CONFIG_E2100 is not set # CONFIG_EWRK3 is not set # CONFIG_EEXPRESS is not set # CONFIG_EEXPRESS_PRO is not set # CONFIG_HPLAN_PLUS is not set # CONFIG_HPLAN is not set # CONFIG_LP486E is not set # CONFIG_ETH16I is not set CONFIG_NE2000=m # CONFIG_ZNET is not set # CONFIG_SEEQ8005 is not set # CONFIG_IBM_NEW_EMAC_ZMII is not set # CONFIG_IBM_NEW_EMAC_RGMII is not set # CONFIG_IBM_NEW_EMAC_TAH is not set # CONFIG_IBM_NEW_EMAC_EMAC4 is not set # CONFIG_IBM_NEW_EMAC_NO_FLOW_CTRL is not set # CONFIG_IBM_NEW_EMAC_MAL_CLR_ICINTSTAT is not set # CONFIG_IBM_NEW_EMAC_MAL_COMMON_ERR is not set CONFIG_NET_PCI=y 
CONFIG_PCNET32=m CONFIG_AMD8111_ETH=m CONFIG_ADAPTEC_STARFIRE=m # CONFIG_AC3200 is not set # CONFIG_APRICOT is not set CONFIG_B44=m CONFIG_B44_PCI_AUTOSELECT=y CONFIG_B44_PCICORE_AUTOSELECT=y CONFIG_B44_PCI=y CONFIG_FORCEDETH=m CONFIG_FORCEDETH_NAPI=y # CONFIG_CS89x0 is not set CONFIG_E100=m # CONFIG_FEALNX is not set # CONFIG_NATSEMI is not set CONFIG_NE2K_PCI=m # CONFIG_8139CP is not set CONFIG_8139TOO=m # CONFIG_8139TOO_PIO is not set # CONFIG_8139TOO_TUNE_TWISTER is not set CONFIG_8139TOO_8129=y # CONFIG_8139_OLD_RX_RESET is not set # CONFIG_R6040 is not set CONFIG_SIS900=m # CONFIG_EPIC100 is not set # CONFIG_SMSC9420 is not set # CONFIG_SUNDANCE is not set # CONFIG_TLAN is not set CONFIG_VIA_RHINE=m CONFIG_VIA_RHINE_MMIO=y # CONFIG_SC92031 is not set CONFIG_NET_POCKET=y CONFIG_ATP=m CONFIG_DE600=m CONFIG_DE620=m # CONFIG_ATL2 is not set CONFIG_NETDEV_1000=y CONFIG_ACENIC=m # CONFIG_ACENIC_OMIT_TIGON_I is not set # CONFIG_DL2K is not set CONFIG_E1000=m CONFIG_E1000E=m # CONFIG_IP1000 is not set # CONFIG_IGB is not set # CONFIG_IGBVF is not set # CONFIG_NS83820 is not set # CONFIG_HAMACHI is not set # CONFIG_YELLOWFIN is not set CONFIG_R8169=m # CONFIG_SIS190 is not set CONFIG_SKGE=m # CONFIG_SKGE_DEBUG is not set CONFIG_SKY2=m # CONFIG_SKY2_DEBUG is not set CONFIG_VIA_VELOCITY=m # CONFIG_TIGON3 is not set # CONFIG_BNX2 is not set # CONFIG_QLA3XXX is not set # CONFIG_ATL1 is not set # CONFIG_ATL1E is not set # CONFIG_ATL1C is not set # CONFIG_JME is not set # CONFIG_NETDEV_10000 is not set # CONFIG_TR is not set # # Wireless LAN # # CONFIG_WLAN_PRE80211 is not set # CONFIG_WLAN_80211 is not set # # Enable WiMAX (Networking options) to see the WiMAX drivers # # # USB Network Adapters # # CONFIG_USB_CATC is not set # CONFIG_USB_KAWETH is not set # CONFIG_USB_PEGASUS is not set # CONFIG_USB_RTL8150 is not set CONFIG_USB_USBNET=m CONFIG_USB_NET_AX8817X=m CONFIG_USB_NET_CDCETHER=m CONFIG_USB_NET_DM9601=m # CONFIG_USB_NET_SMSC95XX is not set CONFIG_USB_NET_GL620A=m 
CONFIG_USB_NET_NET1080=m # CONFIG_USB_NET_PLUSB is not set # CONFIG_USB_NET_MCS7830 is not set # CONFIG_USB_NET_RNDIS_HOST is not set CONFIG_USB_NET_CDC_SUBSET=m CONFIG_USB_ALI_M5632=y CONFIG_USB_AN2720=y CONFIG_USB_BELKIN=y CONFIG_USB_ARMLINUX=y CONFIG_USB_EPSON2888=y CONFIG_USB_KC2190=y # CONFIG_USB_NET_ZAURUS is not set CONFIG_NET_PCMCIA=y # CONFIG_PCMCIA_3C589 is not set # CONFIG_PCMCIA_3C574 is not set # CONFIG_PCMCIA_FMVJ18X is not set CONFIG_PCMCIA_PCNET=m CONFIG_PCMCIA_NMCLAN=m CONFIG_PCMCIA_SMC91C92=m # CONFIG_PCMCIA_XIRC2PS is not set # CONFIG_PCMCIA_AXNET is not set # CONFIG_WAN is not set CONFIG_FDDI=y # CONFIG_DEFXX is not set # CONFIG_SKFP is not set # CONFIG_HIPPI is not set CONFIG_PLIP=m CONFIG_PPP=m CONFIG_PPP_MULTILINK=y CONFIG_PPP_FILTER=y CONFIG_PPP_ASYNC=m CONFIG_PPP_SYNC_TTY=m CONFIG_PPP_DEFLATE=m # CONFIG_PPP_BSDCOMP is not set # CONFIG_PPP_MPPE is not set CONFIG_PPPOE=m # CONFIG_PPPOL2TP is not set CONFIG_SLIP=m CONFIG_SLIP_COMPRESSED=y CONFIG_SLHC=m CONFIG_SLIP_SMART=y # CONFIG_SLIP_MODE_SLIP6 is not set CONFIG_NET_FC=y CONFIG_NETCONSOLE=m # CONFIG_NETCONSOLE_DYNAMIC is not set CONFIG_NETPOLL=y CONFIG_NETPOLL_TRAP=y CONFIG_NET_POLL_CONTROLLER=y # CONFIG_ISDN is not set # CONFIG_PHONE is not set # # Input device support # CONFIG_INPUT=y CONFIG_INPUT_FF_MEMLESS=y CONFIG_INPUT_POLLDEV=m # # Userland interfaces # CONFIG_INPUT_MOUSEDEV=y # CONFIG_INPUT_MOUSEDEV_PSAUX is not set CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024 CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768 # CONFIG_INPUT_JOYDEV is not set CONFIG_INPUT_EVDEV=y # CONFIG_INPUT_EVBUG is not set # # Input Device Drivers # CONFIG_INPUT_KEYBOARD=y CONFIG_KEYBOARD_ATKBD=y # CONFIG_KEYBOARD_SUNKBD is not set # CONFIG_KEYBOARD_LKKBD is not set # CONFIG_KEYBOARD_XTKBD is not set # CONFIG_KEYBOARD_NEWTON is not set # CONFIG_KEYBOARD_STOWAWAY is not set CONFIG_INPUT_MOUSE=y CONFIG_MOUSE_PS2=y CONFIG_MOUSE_PS2_ALPS=y CONFIG_MOUSE_PS2_LOGIPS2PP=y CONFIG_MOUSE_PS2_SYNAPTICS=y CONFIG_MOUSE_PS2_LIFEBOOK=y 
CONFIG_MOUSE_PS2_TRACKPOINT=y # CONFIG_MOUSE_PS2_ELANTECH is not set # CONFIG_MOUSE_PS2_TOUCHKIT is not set CONFIG_MOUSE_SERIAL=m CONFIG_MOUSE_APPLETOUCH=m # CONFIG_MOUSE_BCM5974 is not set # CONFIG_MOUSE_INPORT is not set # CONFIG_MOUSE_LOGIBM is not set # CONFIG_MOUSE_PC110PAD is not set CONFIG_MOUSE_VSXXXAA=m # CONFIG_INPUT_JOYSTICK is not set # CONFIG_INPUT_TABLET is not set # CONFIG_INPUT_TOUCHSCREEN is not set CONFIG_INPUT_MISC=y # CONFIG_INPUT_PCSPKR is not set # CONFIG_INPUT_APANEL is not set # CONFIG_INPUT_WISTRON_BTNS is not set # CONFIG_INPUT_ATLAS_BTNS is not set # CONFIG_INPUT_ATI_REMOTE is not set # CONFIG_INPUT_ATI_REMOTE2 is not set # CONFIG_INPUT_KEYSPAN_REMOTE is not set # CONFIG_INPUT_POWERMATE is not set # CONFIG_INPUT_YEALINK is not set # CONFIG_INPUT_CM109 is not set CONFIG_INPUT_UINPUT=m # # Hardware I/O ports # CONFIG_SERIO=y CONFIG_SERIO_I8042=y CONFIG_SERIO_SERPORT=y # CONFIG_SERIO_CT82C710 is not set # CONFIG_SERIO_PARKBD is not set # CONFIG_SERIO_PCIPS2 is not set CONFIG_SERIO_LIBPS2=y CONFIG_SERIO_RAW=m # CONFIG_GAMEPORT is not set # # Character devices # CONFIG_VT=y CONFIG_CONSOLE_TRANSLATIONS=y CONFIG_VT_CONSOLE=y CONFIG_HW_CONSOLE=y CONFIG_VT_HW_CONSOLE_BINDING=y CONFIG_DEVKMEM=y CONFIG_SERIAL_NONSTANDARD=y # CONFIG_COMPUTONE is not set CONFIG_ROCKETPORT=m CONFIG_CYCLADES=m # CONFIG_CYZ_INTR is not set # CONFIG_DIGIEPCA is not set # CONFIG_MOXA_INTELLIO is not set # CONFIG_MOXA_SMARTIO is not set # CONFIG_ISI is not set # CONFIG_SYNCLINK is not set CONFIG_SYNCLINKMP=m CONFIG_SYNCLINK_GT=m # CONFIG_N_HDLC is not set # CONFIG_RISCOM8 is not set # CONFIG_SPECIALIX is not set # CONFIG_SX is not set # CONFIG_RIO is not set # CONFIG_STALDRV is not set # CONFIG_NOZOMI is not set # # Serial drivers # CONFIG_SERIAL_8250=y CONFIG_SERIAL_8250_CONSOLE=y CONFIG_FIX_EARLYCON_MEM=y CONFIG_SERIAL_8250_PCI=y CONFIG_SERIAL_8250_PNP=y CONFIG_SERIAL_8250_CS=m CONFIG_SERIAL_8250_NR_UARTS=32 CONFIG_SERIAL_8250_RUNTIME_UARTS=4 CONFIG_SERIAL_8250_EXTENDED=y 
CONFIG_SERIAL_8250_MANY_PORTS=y # CONFIG_SERIAL_8250_FOURPORT is not set # CONFIG_SERIAL_8250_ACCENT is not set # CONFIG_SERIAL_8250_BOCA is not set # CONFIG_SERIAL_8250_EXAR_ST16C554 is not set # CONFIG_SERIAL_8250_HUB6 is not set CONFIG_SERIAL_8250_SHARE_IRQ=y CONFIG_SERIAL_8250_DETECT_IRQ=y CONFIG_SERIAL_8250_RSA=y # # Non-8250 serial port support # CONFIG_SERIAL_CORE=y CONFIG_SERIAL_CORE_CONSOLE=y CONFIG_SERIAL_JSM=m CONFIG_UNIX98_PTYS=y # CONFIG_DEVPTS_MULTIPLE_INSTANCES is not set # CONFIG_LEGACY_PTYS is not set CONFIG_PRINTER=m CONFIG_LP_CONSOLE=y CONFIG_PPDEV=m CONFIG_IPMI_HANDLER=m # CONFIG_IPMI_PANIC_EVENT is not set CONFIG_IPMI_DEVICE_INTERFACE=m CONFIG_IPMI_SI=m CONFIG_IPMI_WATCHDOG=m CONFIG_IPMI_POWEROFF=m CONFIG_HW_RANDOM=y # CONFIG_HW_RANDOM_TIMERIOMEM is not set CONFIG_HW_RANDOM_INTEL=m CONFIG_HW_RANDOM_AMD=m CONFIG_HW_RANDOM_GEODE=m CONFIG_HW_RANDOM_VIA=m CONFIG_NVRAM=y CONFIG_RTC=y # CONFIG_DTLK is not set # CONFIG_R3964 is not set # CONFIG_APPLICOM is not set # CONFIG_SONYPI is not set # # PCMCIA character devices # # CONFIG_SYNCLINK_CS is not set CONFIG_CARDMAN_4000=m CONFIG_CARDMAN_4040=m # CONFIG_IPWIRELESS is not set CONFIG_MWAVE=m # CONFIG_PC8736x_GPIO is not set # CONFIG_NSC_GPIO is not set # CONFIG_CS5535_GPIO is not set # CONFIG_RAW_DRIVER is not set CONFIG_HPET=y # CONFIG_HPET_MMAP is not set CONFIG_HANGCHECK_TIMER=m # CONFIG_TCG_TPM is not set # CONFIG_TELCLOCK is not set CONFIG_DEVPORT=y CONFIG_I2C=m CONFIG_I2C_BOARDINFO=y CONFIG_I2C_CHARDEV=m CONFIG_I2C_HELPER_AUTO=y CONFIG_I2C_ALGOBIT=m CONFIG_I2C_ALGOPCA=m # # I2C Hardware Bus support # # # PC SMBus host controller drivers # CONFIG_I2C_ALI1535=m CONFIG_I2C_ALI1563=m CONFIG_I2C_ALI15X3=m CONFIG_I2C_AMD756=m CONFIG_I2C_AMD756_S4882=m # CONFIG_I2C_AMD8111 is not set CONFIG_I2C_I801=m # CONFIG_I2C_ISCH is not set CONFIG_I2C_PIIX4=m CONFIG_I2C_NFORCE2=m # CONFIG_I2C_NFORCE2_S4985 is not set # CONFIG_I2C_SIS5595 is not set # CONFIG_I2C_SIS630 is not set # CONFIG_I2C_SIS96X is not set 
CONFIG_I2C_VIA=m CONFIG_I2C_VIAPRO=m # # I2C system bus drivers (mostly embedded / system-on-chip) # # CONFIG_I2C_OCORES is not set CONFIG_I2C_SIMTEC=m # # External I2C/SMBus adapter drivers # CONFIG_I2C_PARPORT=m CONFIG_I2C_PARPORT_LIGHT=m # CONFIG_I2C_TAOS_EVM is not set # CONFIG_I2C_TINY_USB is not set # # Graphics adapter I2C/DDC channel drivers # CONFIG_I2C_VOODOO3=m # # Other I2C/SMBus bus drivers # CONFIG_I2C_PCA_ISA=m # CONFIG_I2C_PCA_PLATFORM is not set CONFIG_I2C_STUB=m # CONFIG_SCx200_ACB is not set # # Miscellaneous I2C Chip support # # CONFIG_DS1682 is not set # CONFIG_SENSORS_PCF8574 is not set # CONFIG_PCF8575 is not set # CONFIG_SENSORS_PCA9539 is not set CONFIG_SENSORS_MAX6875=m # CONFIG_SENSORS_TSL2550 is not set # CONFIG_I2C_DEBUG_CORE is not set # CONFIG_I2C_DEBUG_ALGO is not set # CONFIG_I2C_DEBUG_BUS is not set # CONFIG_I2C_DEBUG_CHIP is not set # CONFIG_SPI is not set CONFIG_ARCH_WANT_OPTIONAL_GPIOLIB=y # CONFIG_GPIOLIB is not set # CONFIG_W1 is not set CONFIG_POWER_SUPPLY=y # CONFIG_POWER_SUPPLY_DEBUG is not set # CONFIG_PDA_POWER is not set # CONFIG_BATTERY_DS2760 is not set # CONFIG_BATTERY_BQ27x00 is not set CONFIG_HWMON=m CONFIG_HWMON_VID=m # CONFIG_SENSORS_ABITUGURU is not set # CONFIG_SENSORS_ABITUGURU3 is not set # CONFIG_SENSORS_AD7414 is not set CONFIG_SENSORS_AD7418=m # CONFIG_SENSORS_ADM1021 is not set # CONFIG_SENSORS_ADM1025 is not set # CONFIG_SENSORS_ADM1026 is not set # CONFIG_SENSORS_ADM1029 is not set # CONFIG_SENSORS_ADM1031 is not set # CONFIG_SENSORS_ADM9240 is not set # CONFIG_SENSORS_ADT7462 is not set # CONFIG_SENSORS_ADT7470 is not set # CONFIG_SENSORS_ADT7473 is not set # CONFIG_SENSORS_ADT7475 is not set # CONFIG_SENSORS_K8TEMP is not set # CONFIG_SENSORS_ASB100 is not set # CONFIG_SENSORS_ATK0110 is not set # CONFIG_SENSORS_ATXP1 is not set # CONFIG_SENSORS_DS1621 is not set # CONFIG_SENSORS_I5K_AMB is not set # CONFIG_SENSORS_F71805F is not set # CONFIG_SENSORS_F71882FG is not set # CONFIG_SENSORS_F75375S is not 
set # CONFIG_SENSORS_FSCHER is not set # CONFIG_SENSORS_FSCPOS is not set # CONFIG_SENSORS_FSCHMD is not set # CONFIG_SENSORS_G760A is not set # CONFIG_SENSORS_GL518SM is not set # CONFIG_SENSORS_GL520SM is not set CONFIG_SENSORS_CORETEMP=m # CONFIG_SENSORS_IBMAEM is not set # CONFIG_SENSORS_IBMPEX is not set # CONFIG_SENSORS_IT87 is not set # CONFIG_SENSORS_LM63 is not set # CONFIG_SENSORS_LM75 is not set # CONFIG_SENSORS_LM77 is not set # CONFIG_SENSORS_LM78 is not set # CONFIG_SENSORS_LM80 is not set # CONFIG_SENSORS_LM83 is not set # CONFIG_SENSORS_LM85 is not set # CONFIG_SENSORS_LM87 is not set # CONFIG_SENSORS_LM90 is not set # CONFIG_SENSORS_LM92 is not set # CONFIG_SENSORS_LM93 is not set # CONFIG_SENSORS_LTC4215 is not set # CONFIG_SENSORS_LTC4245 is not set # CONFIG_SENSORS_LM95241 is not set # CONFIG_SENSORS_MAX1619 is not set # CONFIG_SENSORS_MAX6650 is not set # CONFIG_SENSORS_PC87360 is not set # CONFIG_SENSORS_PC87427 is not set # CONFIG_SENSORS_PCF8591 is not set CONFIG_SENSORS_SIS5595=m # CONFIG_SENSORS_DME1737 is not set # CONFIG_SENSORS_SMSC47M1 is not set # CONFIG_SENSORS_SMSC47M192 is not set # CONFIG_SENSORS_SMSC47B397 is not set # CONFIG_SENSORS_ADS7828 is not set # CONFIG_SENSORS_THMC50 is not set CONFIG_SENSORS_VIA686A=m CONFIG_SENSORS_VT1211=m CONFIG_SENSORS_VT8231=m # CONFIG_SENSORS_W83781D is not set # CONFIG_SENSORS_W83791D is not set # CONFIG_SENSORS_W83792D is not set # CONFIG_SENSORS_W83793 is not set # CONFIG_SENSORS_W83L785TS is not set # CONFIG_SENSORS_W83L786NG is not set # CONFIG_SENSORS_W83627HF is not set # CONFIG_SENSORS_W83627EHF is not set CONFIG_SENSORS_HDAPS=m # CONFIG_SENSORS_LIS3LV02D is not set # CONFIG_SENSORS_APPLESMC is not set # CONFIG_HWMON_DEBUG_CHIP is not set CONFIG_THERMAL=y # CONFIG_WATCHDOG is not set CONFIG_SSB_POSSIBLE=y # # Sonics Silicon Backplane # CONFIG_SSB=m CONFIG_SSB_SPROM=y CONFIG_SSB_PCIHOST_POSSIBLE=y CONFIG_SSB_PCIHOST=y # CONFIG_SSB_B43_PCI_BRIDGE is not set CONFIG_SSB_PCMCIAHOST_POSSIBLE=y 
CONFIG_SSB_PCMCIAHOST=y # CONFIG_SSB_DEBUG is not set CONFIG_SSB_DRIVER_PCICORE_POSSIBLE=y CONFIG_SSB_DRIVER_PCICORE=y # # Multifunction device drivers # # CONFIG_MFD_CORE is not set # CONFIG_MFD_SM501 is not set # CONFIG_HTC_PASIC3 is not set # CONFIG_MFD_TMIO is not set # CONFIG_MFD_WM8400 is not set # CONFIG_MFD_WM8350_I2C is not set # CONFIG_MFD_PCF50633 is not set # CONFIG_REGULATOR is not set # # Multimedia devices # # # Multimedia core support # CONFIG_VIDEO_DEV=m CONFIG_VIDEO_V4L2_COMMON=m CONFIG_VIDEO_ALLOW_V4L1=y CONFIG_VIDEO_V4L1_COMPAT=y # CONFIG_DVB_CORE is not set CONFIG_VIDEO_MEDIA=m # # Multimedia drivers # # CONFIG_MEDIA_ATTACH is not set CONFIG_MEDIA_TUNER=m # CONFIG_MEDIA_TUNER_CUSTOMISE is not set CONFIG_MEDIA_TUNER_SIMPLE=m CONFIG_MEDIA_TUNER_TDA8290=m CONFIG_MEDIA_TUNER_TDA9887=m CONFIG_MEDIA_TUNER_TEA5761=m CONFIG_MEDIA_TUNER_TEA5767=m CONFIG_MEDIA_TUNER_MT20XX=m CONFIG_MEDIA_TUNER_XC2028=m CONFIG_MEDIA_TUNER_XC5000=m CONFIG_MEDIA_TUNER_MC44S803=m CONFIG_VIDEO_V4L2=m CONFIG_VIDEO_V4L1=m CONFIG_VIDEOBUF_GEN=m CONFIG_VIDEOBUF_DMA_SG=m CONFIG_VIDEO_BTCX=m CONFIG_VIDEO_IR=m CONFIG_VIDEO_TVEEPROM=m CONFIG_VIDEO_TUNER=m CONFIG_VIDEO_CAPTURE_DRIVERS=y # CONFIG_VIDEO_ADV_DEBUG is not set # CONFIG_VIDEO_FIXED_MINOR_RANGES is not set # CONFIG_VIDEO_HELPER_CHIPS_AUTO is not set CONFIG_VIDEO_IR_I2C=m # # Encoders/decoders and other helper chips # # # Audio decoders # CONFIG_VIDEO_TVAUDIO=m CONFIG_VIDEO_TDA7432=m CONFIG_VIDEO_TDA9840=m CONFIG_VIDEO_TDA9875=m CONFIG_VIDEO_TEA6415C=m CONFIG_VIDEO_TEA6420=m CONFIG_VIDEO_MSP3400=m # CONFIG_VIDEO_CS5345 is not set CONFIG_VIDEO_CS53L32A=m CONFIG_VIDEO_M52790=m CONFIG_VIDEO_TLV320AIC23B=m CONFIG_VIDEO_WM8775=m CONFIG_VIDEO_WM8739=m CONFIG_VIDEO_VP27SMPX=m # # RDS decoders # # CONFIG_VIDEO_SAA6588 is not set # # Video decoders # CONFIG_VIDEO_BT819=m CONFIG_VIDEO_BT856=m CONFIG_VIDEO_BT866=m CONFIG_VIDEO_KS0127=m CONFIG_VIDEO_OV7670=m # CONFIG_VIDEO_TCM825X is not set CONFIG_VIDEO_SAA7110=m CONFIG_VIDEO_SAA711X=m 
CONFIG_VIDEO_SAA717X=m CONFIG_VIDEO_SAA7191=m # CONFIG_VIDEO_TVP514X is not set CONFIG_VIDEO_TVP5150=m CONFIG_VIDEO_VPX3220=m # # Video and audio decoders # CONFIG_VIDEO_CX25840=m # # MPEG video encoders # CONFIG_VIDEO_CX2341X=m # # Video encoders # CONFIG_VIDEO_SAA7127=m CONFIG_VIDEO_SAA7185=m CONFIG_VIDEO_ADV7170=m CONFIG_VIDEO_ADV7175=m # # Video improvement chips # CONFIG_VIDEO_UPD64031A=m CONFIG_VIDEO_UPD64083=m # CONFIG_VIDEO_VIVI is not set CONFIG_VIDEO_BT848=m # CONFIG_VIDEO_PMS is not set # CONFIG_VIDEO_BWQCAM is not set # CONFIG_VIDEO_CQCAM is not set # CONFIG_VIDEO_W9966 is not set CONFIG_VIDEO_CPIA=m CONFIG_VIDEO_CPIA_PP=m CONFIG_VIDEO_CPIA_USB=m CONFIG_VIDEO_CPIA2=m # CONFIG_VIDEO_SAA5246A is not set # CONFIG_VIDEO_SAA5249 is not set # CONFIG_VIDEO_STRADIS is not set CONFIG_VIDEO_ZORAN=m # CONFIG_VIDEO_ZORAN_DC30 is not set CONFIG_VIDEO_ZORAN_ZR36060=m CONFIG_VIDEO_ZORAN_BUZ=m # CONFIG_VIDEO_ZORAN_DC10 is not set CONFIG_VIDEO_ZORAN_LML33=m # CONFIG_VIDEO_ZORAN_LML33R10 is not set # CONFIG_VIDEO_ZORAN_AVS6EYES is not set # CONFIG_VIDEO_SAA7134 is not set # CONFIG_VIDEO_MXB is not set # CONFIG_VIDEO_HEXIUM_ORION is not set # CONFIG_VIDEO_HEXIUM_GEMINI is not set # CONFIG_VIDEO_CX88 is not set CONFIG_VIDEO_IVTV=m # CONFIG_VIDEO_FB_IVTV is not set # CONFIG_VIDEO_CAFE_CCIC is not set # CONFIG_SOC_CAMERA is not set # CONFIG_V4L_USB_DRIVERS is not set CONFIG_RADIO_ADAPTERS=y # CONFIG_RADIO_CADET is not set # CONFIG_RADIO_RTRACK is not set # CONFIG_RADIO_RTRACK2 is not set # CONFIG_RADIO_AZTECH is not set # CONFIG_RADIO_GEMTEK is not set # CONFIG_RADIO_GEMTEK_PCI is not set CONFIG_RADIO_MAXIRADIO=m CONFIG_RADIO_MAESTRO=m # CONFIG_RADIO_SF16FMI is not set # CONFIG_RADIO_SF16FMR2 is not set # CONFIG_RADIO_TERRATEC is not set # CONFIG_RADIO_TRUST is not set # CONFIG_RADIO_TYPHOON is not set # CONFIG_RADIO_ZOLTRIX is not set CONFIG_USB_DSBR=m # CONFIG_USB_SI470X is not set # CONFIG_USB_MR800 is not set # CONFIG_RADIO_TEA5764 is not set CONFIG_DAB=y 
CONFIG_USB_DABUSB=m # # Graphics support # CONFIG_AGP=y CONFIG_AGP_ALI=y CONFIG_AGP_ATI=y # CONFIG_AGP_AMD is not set # CONFIG_AGP_AMD64 is not set CONFIG_AGP_INTEL=y CONFIG_AGP_NVIDIA=y CONFIG_AGP_SIS=y # CONFIG_AGP_SWORKS is not set CONFIG_AGP_VIA=y CONFIG_AGP_EFFICEON=y CONFIG_DRM=m CONFIG_DRM_TDFX=m CONFIG_DRM_R128=m CONFIG_DRM_RADEON=m CONFIG_DRM_I810=m CONFIG_DRM_I830=m CONFIG_DRM_I915=m # CONFIG_DRM_I915_KMS is not set # CONFIG_DRM_MGA is not set CONFIG_DRM_SIS=m # CONFIG_DRM_VIA is not set # CONFIG_DRM_SAVAGE is not set CONFIG_VGASTATE=m CONFIG_VIDEO_OUTPUT_CONTROL=m CONFIG_FB=y # CONFIG_FIRMWARE_EDID is not set CONFIG_FB_DDC=m CONFIG_FB_BOOT_VESA_SUPPORT=y CONFIG_FB_CFB_FILLRECT=y CONFIG_FB_CFB_COPYAREA=y CONFIG_FB_CFB_IMAGEBLIT=y # CONFIG_FB_CFB_REV_PIXELS_IN_BYTE is not set # CONFIG_FB_SYS_FILLRECT is not set # CONFIG_FB_SYS_COPYAREA is not set # CONFIG_FB_SYS_IMAGEBLIT is not set # CONFIG_FB_FOREIGN_ENDIAN is not set # CONFIG_FB_SYS_FOPS is not set CONFIG_FB_SVGALIB=m # CONFIG_FB_MACMODES is not set CONFIG_FB_BACKLIGHT=y CONFIG_FB_MODE_HELPERS=y CONFIG_FB_TILEBLITTING=y # # Frame buffer hardware drivers # # CONFIG_FB_CIRRUS is not set # CONFIG_FB_PM2 is not set # CONFIG_FB_CYBER2000 is not set # CONFIG_FB_ARC is not set # CONFIG_FB_ASILIANT is not set # CONFIG_FB_IMSTT is not set # CONFIG_FB_VGA16 is not set CONFIG_FB_VESA=y # CONFIG_FB_EFI is not set # CONFIG_FB_N411 is not set # CONFIG_FB_HGA is not set # CONFIG_FB_S1D13XXX is not set CONFIG_FB_NVIDIA=m CONFIG_FB_NVIDIA_I2C=y # CONFIG_FB_NVIDIA_DEBUG is not set CONFIG_FB_NVIDIA_BACKLIGHT=y # CONFIG_FB_RIVA is not set # CONFIG_FB_I810 is not set # CONFIG_FB_LE80578 is not set # CONFIG_FB_INTEL is not set # CONFIG_FB_MATROX is not set CONFIG_FB_RADEON=m CONFIG_FB_RADEON_I2C=y CONFIG_FB_RADEON_BACKLIGHT=y # CONFIG_FB_RADEON_DEBUG is not set # CONFIG_FB_ATY128 is not set # CONFIG_FB_ATY is not set CONFIG_FB_S3=m CONFIG_FB_SAVAGE=m CONFIG_FB_SAVAGE_I2C=y CONFIG_FB_SAVAGE_ACCEL=y # CONFIG_FB_SIS is not set 
# CONFIG_FB_VIA is not set # CONFIG_FB_NEOMAGIC is not set # CONFIG_FB_KYRO is not set # CONFIG_FB_3DFX is not set # CONFIG_FB_VOODOO1 is not set # CONFIG_FB_VT8623 is not set CONFIG_FB_TRIDENT=m # CONFIG_FB_ARK is not set # CONFIG_FB_PM3 is not set # CONFIG_FB_CARMINE is not set # CONFIG_FB_GEODE is not set # CONFIG_FB_VIRTUAL is not set # CONFIG_FB_METRONOME is not set # CONFIG_FB_MB862XX is not set # CONFIG_FB_BROADSHEET is not set CONFIG_BACKLIGHT_LCD_SUPPORT=y CONFIG_LCD_CLASS_DEVICE=m # CONFIG_LCD_ILI9320 is not set # CONFIG_LCD_PLATFORM is not set CONFIG_BACKLIGHT_CLASS_DEVICE=y CONFIG_BACKLIGHT_GENERIC=y CONFIG_BACKLIGHT_PROGEAR=m # CONFIG_BACKLIGHT_MBP_NVIDIA is not set # CONFIG_BACKLIGHT_SAHARA is not set # # Display device support # CONFIG_DISPLAY_SUPPORT=m # # Display hardware drivers # # # Console display driver support # CONFIG_VGA_CONSOLE=y CONFIG_VGACON_SOFT_SCROLLBACK=y CONFIG_VGACON_SOFT_SCROLLBACK_SIZE=64 # CONFIG_MDA_CONSOLE is not set CONFIG_DUMMY_CONSOLE=y CONFIG_FRAMEBUFFER_CONSOLE=y CONFIG_FRAMEBUFFER_CONSOLE_DETECT_PRIMARY=y CONFIG_FRAMEBUFFER_CONSOLE_ROTATION=y # CONFIG_FONTS is not set CONFIG_FONT_8x8=y CONFIG_FONT_8x16=y CONFIG_LOGO=y # CONFIG_LOGO_LINUX_MONO is not set # CONFIG_LOGO_LINUX_VGA16 is not set CONFIG_LOGO_LINUX_CLUT224=y # CONFIG_SOUND is not set # CONFIG_HID_SUPPORT is not set CONFIG_USB_SUPPORT=y CONFIG_USB_ARCH_HAS_HCD=y CONFIG_USB_ARCH_HAS_OHCI=y CONFIG_USB_ARCH_HAS_EHCI=y CONFIG_USB=y # CONFIG_USB_DEBUG is not set # CONFIG_USB_ANNOUNCE_NEW_DEVICES is not set # # Miscellaneous USB options # CONFIG_USB_DEVICEFS=y # CONFIG_USB_DEVICE_CLASS is not set # CONFIG_USB_DYNAMIC_MINORS is not set CONFIG_USB_SUSPEND=y # CONFIG_USB_OTG is not set # CONFIG_USB_MON is not set # CONFIG_USB_WUSB is not set # CONFIG_USB_WUSB_CBAF is not set # # USB Host Controller Drivers # # CONFIG_USB_C67X00_HCD is not set CONFIG_USB_EHCI_HCD=m CONFIG_USB_EHCI_ROOT_HUB_TT=y CONFIG_USB_EHCI_TT_NEWSCHED=y # CONFIG_USB_OXU210HP_HCD is not set # 
CONFIG_USB_ISP116X_HCD is not set # CONFIG_USB_ISP1760_HCD is not set CONFIG_USB_OHCI_HCD=m # CONFIG_USB_OHCI_HCD_SSB is not set # CONFIG_USB_OHCI_BIG_ENDIAN_DESC is not set # CONFIG_USB_OHCI_BIG_ENDIAN_MMIO is not set CONFIG_USB_OHCI_LITTLE_ENDIAN=y CONFIG_USB_UHCI_HCD=m # CONFIG_USB_U132_HCD is not set # CONFIG_USB_SL811_HCD is not set # CONFIG_USB_R8A66597_HCD is not set # CONFIG_USB_WHCI_HCD is not set # CONFIG_USB_HWA_HCD is not set # # USB Device Class drivers # # CONFIG_USB_ACM is not set # CONFIG_USB_PRINTER is not set # CONFIG_USB_WDM is not set # CONFIG_USB_TMC is not set # # NOTE: USB_STORAGE depends on SCSI but BLK_DEV_SD may # # # also be needed; see USB_STORAGE Help for more info # CONFIG_USB_STORAGE=m # CONFIG_USB_STORAGE_DEBUG is not set CONFIG_USB_STORAGE_DATAFAB=m CONFIG_USB_STORAGE_FREECOM=m # CONFIG_USB_STORAGE_ISD200 is not set CONFIG_USB_STORAGE_USBAT=m # CONFIG_USB_STORAGE_SDDR09 is not set # CONFIG_USB_STORAGE_SDDR55 is not set # CONFIG_USB_STORAGE_JUMPSHOT is not set # CONFIG_USB_STORAGE_ALAUDA is not set # CONFIG_USB_STORAGE_ONETOUCH is not set # CONFIG_USB_STORAGE_KARMA is not set # CONFIG_USB_STORAGE_CYPRESS_ATACB is not set # CONFIG_USB_LIBUSUAL is not set # # USB Imaging devices # # CONFIG_USB_MDC800 is not set # CONFIG_USB_MICROTEK is not set # # USB port drivers # # CONFIG_USB_USS720 is not set CONFIG_USB_SERIAL=m CONFIG_USB_EZUSB=y CONFIG_USB_SERIAL_GENERIC=y # CONFIG_USB_SERIAL_AIRCABLE is not set # CONFIG_USB_SERIAL_ARK3116 is not set # CONFIG_USB_SERIAL_BELKIN is not set # CONFIG_USB_SERIAL_CH341 is not set # CONFIG_USB_SERIAL_WHITEHEAT is not set # CONFIG_USB_SERIAL_DIGI_ACCELEPORT is not set # CONFIG_USB_SERIAL_CP210X is not set # CONFIG_USB_SERIAL_CYPRESS_M8 is not set CONFIG_USB_SERIAL_EMPEG=m # CONFIG_USB_SERIAL_FTDI_SIO is not set # CONFIG_USB_SERIAL_FUNSOFT is not set # CONFIG_USB_SERIAL_VISOR is not set # CONFIG_USB_SERIAL_IPAQ is not set # CONFIG_USB_SERIAL_IR is not set # CONFIG_USB_SERIAL_EDGEPORT is not set # 
CONFIG_USB_SERIAL_EDGEPORT_TI is not set # CONFIG_USB_SERIAL_GARMIN is not set # CONFIG_USB_SERIAL_IPW is not set # CONFIG_USB_SERIAL_IUU is not set # CONFIG_USB_SERIAL_KEYSPAN_PDA is not set CONFIG_USB_SERIAL_KEYSPAN=m # CONFIG_USB_SERIAL_KEYSPAN_MPR is not set # CONFIG_USB_SERIAL_KEYSPAN_USA28 is not set # CONFIG_USB_SERIAL_KEYSPAN_USA28X is not set # CONFIG_USB_SERIAL_KEYSPAN_USA28XA is not set # CONFIG_USB_SERIAL_KEYSPAN_USA28XB is not set # CONFIG_USB_SERIAL_KEYSPAN_USA19 is not set # CONFIG_USB_SERIAL_KEYSPAN_USA18X is not set # CONFIG_USB_SERIAL_KEYSPAN_USA19W is not set CONFIG_USB_SERIAL_KEYSPAN_USA19QW=y CONFIG_USB_SERIAL_KEYSPAN_USA19QI=y CONFIG_USB_SERIAL_KEYSPAN_USA49W=y CONFIG_USB_SERIAL_KEYSPAN_USA49WLC=y # CONFIG_USB_SERIAL_KLSI is not set # CONFIG_USB_SERIAL_KOBIL_SCT is not set # CONFIG_USB_SERIAL_MCT_U232 is not set # CONFIG_USB_SERIAL_MOS7720 is not set # CONFIG_USB_SERIAL_MOS7840 is not set # CONFIG_USB_SERIAL_MOTOROLA is not set # CONFIG_USB_SERIAL_NAVMAN is not set # CONFIG_USB_SERIAL_PL2303 is not set # CONFIG_USB_SERIAL_OTI6858 is not set # CONFIG_USB_SERIAL_QUALCOMM is not set # CONFIG_USB_SERIAL_SPCP8X5 is not set # CONFIG_USB_SERIAL_HP4X is not set # CONFIG_USB_SERIAL_SAFE is not set # CONFIG_USB_SERIAL_SIEMENS_MPI is not set # CONFIG_USB_SERIAL_SIERRAWIRELESS is not set # CONFIG_USB_SERIAL_SYMBOL is not set # CONFIG_USB_SERIAL_TI is not set # CONFIG_USB_SERIAL_CYBERJACK is not set # CONFIG_USB_SERIAL_XIRCOM is not set # CONFIG_USB_SERIAL_OPTION is not set # CONFIG_USB_SERIAL_OMNINET is not set # CONFIG_USB_SERIAL_OPTICON is not set # CONFIG_USB_SERIAL_DEBUG is not set # # USB Miscellaneous drivers # # CONFIG_USB_EMI62 is not set # CONFIG_USB_EMI26 is not set # CONFIG_USB_ADUTUX is not set # CONFIG_USB_SEVSEG is not set # CONFIG_USB_RIO500 is not set # CONFIG_USB_LEGOTOWER is not set # CONFIG_USB_LCD is not set # CONFIG_USB_BERRY_CHARGE is not set # CONFIG_USB_LED is not set # CONFIG_USB_CYPRESS_CY7C63 is not set # CONFIG_USB_CYTHERM is 
not set # CONFIG_USB_IDMOUSE is not set CONFIG_USB_FTDI_ELAN=m # CONFIG_USB_APPLEDISPLAY is not set # CONFIG_USB_SISUSBVGA is not set # CONFIG_USB_LD is not set # CONFIG_USB_TRANCEVIBRATOR is not set # CONFIG_USB_IOWARRIOR is not set # CONFIG_USB_TEST is not set # CONFIG_USB_ISIGHTFW is not set # CONFIG_USB_VST is not set # CONFIG_USB_GADGET is not set # # OTG and related infrastructure # # CONFIG_NOP_USB_XCEIV is not set # CONFIG_UWB is not set # CONFIG_MMC is not set # CONFIG_MEMSTICK is not set CONFIG_NEW_LEDS=y CONFIG_LEDS_CLASS=y # # LED drivers # # CONFIG_LEDS_ALIX2 is not set # CONFIG_LEDS_PCA9532 is not set # CONFIG_LEDS_LP5521 is not set # CONFIG_LEDS_CLEVO_MAIL is not set # CONFIG_LEDS_PCA955X is not set # CONFIG_LEDS_BD2802 is not set # # LED Triggers # CONFIG_LEDS_TRIGGERS=y CONFIG_LEDS_TRIGGER_TIMER=m # CONFIG_LEDS_TRIGGER_HEARTBEAT is not set # CONFIG_LEDS_TRIGGER_BACKLIGHT is not set # CONFIG_LEDS_TRIGGER_DEFAULT_ON is not set # # iptables trigger is under Netfilter config (LED target) # # CONFIG_ACCESSIBILITY is not set # CONFIG_INFINIBAND is not set # CONFIG_EDAC is not set # CONFIG_RTC_CLASS is not set # CONFIG_DMADEVICES is not set # CONFIG_AUXDISPLAY is not set CONFIG_UIO=m # CONFIG_UIO_CIF is not set # CONFIG_UIO_PDRV is not set # CONFIG_UIO_PDRV_GENIRQ is not set # CONFIG_UIO_SMX is not set # CONFIG_UIO_AEC is not set # CONFIG_UIO_SERCOS3 is not set # CONFIG_STAGING is not set CONFIG_X86_PLATFORM_DEVICES=y # CONFIG_ASUS_LAPTOP is not set # CONFIG_FUJITSU_LAPTOP is not set # CONFIG_TC1100_WMI is not set # CONFIG_MSI_LAPTOP is not set # CONFIG_PANASONIC_LAPTOP is not set # CONFIG_COMPAL_LAPTOP is not set # CONFIG_THINKPAD_ACPI is not set # CONFIG_INTEL_MENLOW is not set # CONFIG_EEEPC_LAPTOP is not set # CONFIG_ACPI_WMI is not set # CONFIG_ACPI_ASUS is not set # CONFIG_ACPI_TOSHIBA is not set # # Firmware Drivers # CONFIG_EDD=m # CONFIG_EDD_OFF is not set CONFIG_FIRMWARE_MEMMAP=y CONFIG_EFI_VARS=y # CONFIG_DELL_RBU is not set # CONFIG_DCDBAS is 
not set CONFIG_DMIID=y # CONFIG_ISCSI_IBFT_FIND is not set # # File systems # CONFIG_EXT2_FS=m # CONFIG_EXT2_FS_XATTR is not set CONFIG_EXT2_FS_XIP=y CONFIG_EXT3_FS=m # CONFIG_EXT3_DEFAULTS_TO_ORDERED is not set CONFIG_EXT3_FS_XATTR=y CONFIG_EXT3_FS_POSIX_ACL=y CONFIG_EXT3_FS_SECURITY=y CONFIG_EXT4_FS=m CONFIG_EXT4DEV_COMPAT=y CONFIG_EXT4_FS_XATTR=y CONFIG_EXT4_FS_POSIX_ACL=y CONFIG_EXT4_FS_SECURITY=y CONFIG_FS_XIP=y CONFIG_JBD=m # CONFIG_JBD_DEBUG is not set CONFIG_JBD2=m # CONFIG_JBD2_DEBUG is not set CONFIG_FS_MBCACHE=m # CONFIG_REISERFS_FS is not set # CONFIG_JFS_FS is not set CONFIG_FS_POSIX_ACL=y CONFIG_FILE_LOCKING=y # CONFIG_XFS_FS is not set # CONFIG_GFS2_FS is not set # CONFIG_OCFS2_FS is not set # CONFIG_BTRFS_FS is not set CONFIG_DNOTIFY=y CONFIG_INOTIFY=y CONFIG_INOTIFY_USER=y # CONFIG_QUOTA is not set # CONFIG_AUTOFS_FS is not set CONFIG_AUTOFS4_FS=m CONFIG_FUSE_FS=m CONFIG_GENERIC_ACL=y # # Caches # # CONFIG_FSCACHE is not set # # CD-ROM/DVD Filesystems # CONFIG_ISO9660_FS=y CONFIG_JOLIET=y CONFIG_ZISOFS=y CONFIG_UDF_FS=y CONFIG_UDF_NLS=y # # DOS/FAT/NT Filesystems # CONFIG_FAT_FS=m CONFIG_MSDOS_FS=m CONFIG_VFAT_FS=m CONFIG_FAT_DEFAULT_CODEPAGE=437 CONFIG_FAT_DEFAULT_IOCHARSET="ascii" # CONFIG_NTFS_FS is not set # # Pseudo filesystems # CONFIG_PROC_FS=y CONFIG_PROC_KCORE=y CONFIG_PROC_VMCORE=y CONFIG_PROC_SYSCTL=y CONFIG_PROC_PAGE_MONITOR=y CONFIG_SYSFS=y CONFIG_TMPFS=y CONFIG_TMPFS_POSIX_ACL=y CONFIG_HUGETLBFS=y CONFIG_HUGETLB_PAGE=y CONFIG_CONFIGFS_FS=m CONFIG_MISC_FILESYSTEMS=y # CONFIG_ADFS_FS is not set # CONFIG_AFFS_FS is not set # CONFIG_HFS_FS is not set # CONFIG_HFSPLUS_FS is not set # CONFIG_BEFS_FS is not set # CONFIG_BFS_FS is not set # CONFIG_EFS_FS is not set CONFIG_CRAMFS=m # CONFIG_SQUASHFS is not set # CONFIG_VXFS_FS is not set # CONFIG_MINIX_FS is not set # CONFIG_OMFS_FS is not set # CONFIG_HPFS_FS is not set # CONFIG_QNX4FS_FS is not set CONFIG_ROMFS_FS=m CONFIG_ROMFS_BACKED_BY_BLOCK=y # CONFIG_ROMFS_BACKED_BY_MTD is not set # 
CONFIG_ROMFS_BACKED_BY_BOTH is not set CONFIG_ROMFS_ON_BLOCK=y # CONFIG_SYSV_FS is not set CONFIG_UFS_FS=m # CONFIG_UFS_FS_WRITE is not set # CONFIG_UFS_DEBUG is not set # CONFIG_NILFS2_FS is not set CONFIG_NETWORK_FILESYSTEMS=y CONFIG_NFS_FS=m CONFIG_NFS_V3=y CONFIG_NFS_V3_ACL=y CONFIG_NFS_V4=y # CONFIG_NFSD is not set CONFIG_LOCKD=m CONFIG_LOCKD_V4=y CONFIG_NFS_ACL_SUPPORT=m CONFIG_NFS_COMMON=y CONFIG_SUNRPC=m CONFIG_SUNRPC_GSS=m CONFIG_RPCSEC_GSS_KRB5=m # CONFIG_RPCSEC_GSS_SPKM3 is not set # CONFIG_SMB_FS is not set # CONFIG_CIFS is not set # CONFIG_NCP_FS is not set # CONFIG_CODA_FS is not set # CONFIG_AFS_FS is not set # # Partition Types # CONFIG_PARTITION_ADVANCED=y # CONFIG_ACORN_PARTITION is not set # CONFIG_OSF_PARTITION is not set # CONFIG_AMIGA_PARTITION is not set # CONFIG_ATARI_PARTITION is not set # CONFIG_MAC_PARTITION is not set CONFIG_MSDOS_PARTITION=y CONFIG_BSD_DISKLABEL=y # CONFIG_MINIX_SUBPARTITION is not set # CONFIG_SOLARIS_X86_PARTITION is not set # CONFIG_UNIXWARE_DISKLABEL is not set # CONFIG_LDM_PARTITION is not set # CONFIG_SGI_PARTITION is not set # CONFIG_ULTRIX_PARTITION is not set # CONFIG_SUN_PARTITION is not set # CONFIG_KARMA_PARTITION is not set CONFIG_EFI_PARTITION=y # CONFIG_SYSV68_PARTITION is not set CONFIG_NLS=y CONFIG_NLS_DEFAULT="utf8" CONFIG_NLS_CODEPAGE_437=y # CONFIG_NLS_CODEPAGE_737 is not set # CONFIG_NLS_CODEPAGE_775 is not set CONFIG_NLS_CODEPAGE_850=m CONFIG_NLS_CODEPAGE_852=m # CONFIG_NLS_CODEPAGE_855 is not set # CONFIG_NLS_CODEPAGE_857 is not set # CONFIG_NLS_CODEPAGE_860 is not set # CONFIG_NLS_CODEPAGE_861 is not set # CONFIG_NLS_CODEPAGE_862 is not set CONFIG_NLS_CODEPAGE_863=m # CONFIG_NLS_CODEPAGE_864 is not set # CONFIG_NLS_CODEPAGE_865 is not set # CONFIG_NLS_CODEPAGE_866 is not set # CONFIG_NLS_CODEPAGE_869 is not set CONFIG_NLS_CODEPAGE_936=m CONFIG_NLS_CODEPAGE_950=m CONFIG_NLS_CODEPAGE_932=m # CONFIG_NLS_CODEPAGE_949 is not set # CONFIG_NLS_CODEPAGE_874 is not set CONFIG_NLS_ISO8859_8=m 
CONFIG_NLS_CODEPAGE_1250=m CONFIG_NLS_CODEPAGE_1251=m CONFIG_NLS_ASCII=y # CONFIG_NLS_ISO8859_1 is not set # CONFIG_NLS_ISO8859_2 is not set # CONFIG_NLS_ISO8859_3 is not set # CONFIG_NLS_ISO8859_4 is not set # CONFIG_NLS_ISO8859_5 is not set # CONFIG_NLS_ISO8859_6 is not set # CONFIG_NLS_ISO8859_7 is not set # CONFIG_NLS_ISO8859_9 is not set # CONFIG_NLS_ISO8859_13 is not set # CONFIG_NLS_ISO8859_14 is not set # CONFIG_NLS_ISO8859_15 is not set # CONFIG_NLS_KOI8_R is not set # CONFIG_NLS_KOI8_U is not set CONFIG_NLS_UTF8=m # CONFIG_DLM is not set # # Kernel hacking # CONFIG_TRACE_IRQFLAGS_SUPPORT=y # CONFIG_PRINTK_TIME is not set # CONFIG_ENABLE_WARN_DEPRECATED is not set # CONFIG_ENABLE_MUST_CHECK is not set CONFIG_FRAME_WARN=1024 CONFIG_MAGIC_SYSRQ=y # CONFIG_UNUSED_SYMBOLS is not set CONFIG_DEBUG_FS=y CONFIG_HEADERS_CHECK=y CONFIG_DEBUG_KERNEL=y CONFIG_DEBUG_SHIRQ=y CONFIG_DETECT_SOFTLOCKUP=y # CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC is not set CONFIG_BOOTPARAM_SOFTLOCKUP_PANIC_VALUE=0 CONFIG_DETECT_HUNG_TASK=y # CONFIG_BOOTPARAM_HUNG_TASK_PANIC is not set CONFIG_BOOTPARAM_HUNG_TASK_PANIC_VALUE=0 CONFIG_SCHED_DEBUG=y CONFIG_SCHEDSTATS=y CONFIG_TIMER_STATS=y # CONFIG_DEBUG_OBJECTS is not set # CONFIG_SLUB_DEBUG_ON is not set # CONFIG_SLUB_STATS is not set CONFIG_DEBUG_PREEMPT=y # CONFIG_DEBUG_RT_MUTEXES is not set # CONFIG_RT_MUTEX_TESTER is not set CONFIG_DEBUG_SPINLOCK=y CONFIG_DEBUG_MUTEXES=y CONFIG_DEBUG_LOCK_ALLOC=y CONFIG_PROVE_LOCKING=y CONFIG_LOCKDEP=y # CONFIG_LOCK_STAT is not set CONFIG_DEBUG_LOCKDEP=y CONFIG_TRACE_IRQFLAGS=y CONFIG_DEBUG_SPINLOCK_SLEEP=y # CONFIG_DEBUG_LOCKING_API_SELFTESTS is not set CONFIG_STACKTRACE=y # CONFIG_DEBUG_KOBJECT is not set CONFIG_DEBUG_HIGHMEM=y CONFIG_DEBUG_BUGVERBOSE=y CONFIG_DEBUG_INFO=y # CONFIG_DEBUG_VM is not set # CONFIG_DEBUG_VIRTUAL is not set # CONFIG_DEBUG_WRITECOUNT is not set CONFIG_DEBUG_MEMORY_INIT=y CONFIG_DEBUG_LIST=y # CONFIG_DEBUG_SG is not set # CONFIG_DEBUG_NOTIFIERS is not set 
CONFIG_ARCH_WANT_FRAME_POINTERS=y CONFIG_FRAME_POINTER=y # CONFIG_BOOT_PRINTK_DELAY is not set # CONFIG_RCU_TORTURE_TEST is not set # CONFIG_KPROBES_SANITY_TEST is not set # CONFIG_BACKTRACE_SELF_TEST is not set # CONFIG_DEBUG_BLOCK_EXT_DEVT is not set # CONFIG_LKDTM is not set # CONFIG_FAULT_INJECTION is not set # CONFIG_LATENCYTOP is not set CONFIG_SYSCTL_SYSCALL_CHECK=y # CONFIG_DEBUG_PAGEALLOC is not set CONFIG_USER_STACKTRACE_SUPPORT=y CONFIG_NOP_TRACER=y CONFIG_HAVE_FTRACE_NMI_ENTER=y CONFIG_HAVE_FUNCTION_TRACER=y CONFIG_HAVE_FUNCTION_GRAPH_TRACER=y CONFIG_HAVE_FUNCTION_TRACE_MCOUNT_TEST=y CONFIG_HAVE_DYNAMIC_FTRACE=y CONFIG_HAVE_FTRACE_MCOUNT_RECORD=y CONFIG_HAVE_FTRACE_SYSCALLS=y CONFIG_TRACER_MAX_TRACE=y CONFIG_RING_BUFFER=y CONFIG_FTRACE_NMI_ENTER=y CONFIG_TRACING=y CONFIG_TRACING_SUPPORT=y # # Tracers # CONFIG_FUNCTION_TRACER=y CONFIG_FUNCTION_GRAPH_TRACER=y CONFIG_IRQSOFF_TRACER=y CONFIG_PREEMPT_TRACER=y CONFIG_SYSPROF_TRACER=y CONFIG_SCHED_TRACER=y CONFIG_CONTEXT_SWITCH_TRACER=y CONFIG_EVENT_TRACER=y CONFIG_FTRACE_SYSCALLS=y CONFIG_BOOT_TRACER=y # CONFIG_TRACE_BRANCH_PROFILING is not set CONFIG_POWER_TRACER=y CONFIG_STACK_TRACER=y # CONFIG_KMEMTRACE is not set CONFIG_WORKQUEUE_TRACER=y CONFIG_BLK_DEV_IO_TRACE=y CONFIG_DYNAMIC_FTRACE=y CONFIG_FTRACE_MCOUNT_RECORD=y # CONFIG_FTRACE_STARTUP_TEST is not set CONFIG_MMIOTRACE=y CONFIG_MMIOTRACE_TEST=m # CONFIG_PROVIDE_OHCI1394_DMA_INIT is not set # CONFIG_FIREWIRE_OHCI_REMOTE_DMA is not set # CONFIG_BUILD_DOCSRC is not set # CONFIG_DYNAMIC_DEBUG is not set # CONFIG_DMA_API_DEBUG is not set CONFIG_SAMPLES=y # CONFIG_SAMPLE_MARKERS is not set # CONFIG_SAMPLE_TRACEPOINTS is not set CONFIG_SAMPLE_KOBJECT=m CONFIG_SAMPLE_KPROBES=m CONFIG_SAMPLE_KRETPROBES=m CONFIG_HAVE_ARCH_KGDB=y # CONFIG_KGDB is not set # CONFIG_STRICT_DEVMEM is not set CONFIG_X86_VERBOSE_BOOTUP=y CONFIG_EARLY_PRINTK=y # CONFIG_EARLY_PRINTK_DBGP is not set # CONFIG_DEBUG_STACKOVERFLOW is not set # CONFIG_DEBUG_STACK_USAGE is not set # 
CONFIG_DEBUG_PER_CPU_MAPS is not set # CONFIG_X86_PTDUMP is not set CONFIG_DEBUG_RODATA=y # CONFIG_DEBUG_RODATA_TEST is not set # CONFIG_DEBUG_NX_TEST is not set CONFIG_4KSTACKS=y CONFIG_DOUBLEFAULT=y CONFIG_HAVE_MMIOTRACE_SUPPORT=y CONFIG_IO_DELAY_TYPE_0X80=0 CONFIG_IO_DELAY_TYPE_0XED=1 CONFIG_IO_DELAY_TYPE_UDELAY=2 CONFIG_IO_DELAY_TYPE_NONE=3 CONFIG_IO_DELAY_0X80=y # CONFIG_IO_DELAY_0XED is not set # CONFIG_IO_DELAY_UDELAY is not set # CONFIG_IO_DELAY_NONE is not set CONFIG_DEFAULT_IO_DELAY_TYPE=0 # CONFIG_DEBUG_BOOT_PARAMS is not set # CONFIG_CPA_DEBUG is not set # CONFIG_OPTIMIZE_INLINING is not set # # Security options # # CONFIG_KEYS is not set # CONFIG_SECURITY is not set # CONFIG_SECURITYFS is not set # CONFIG_SECURITY_FILE_CAPABILITIES is not set # CONFIG_IMA is not set CONFIG_CRYPTO=y # # Crypto core or helper # # CONFIG_CRYPTO_FIPS is not set CONFIG_CRYPTO_ALGAPI=y CONFIG_CRYPTO_ALGAPI2=y CONFIG_CRYPTO_AEAD2=y CONFIG_CRYPTO_BLKCIPHER=m CONFIG_CRYPTO_BLKCIPHER2=y CONFIG_CRYPTO_HASH=y CONFIG_CRYPTO_HASH2=y CONFIG_CRYPTO_RNG2=y CONFIG_CRYPTO_PCOMP=y CONFIG_CRYPTO_MANAGER=y CONFIG_CRYPTO_MANAGER2=y # CONFIG_CRYPTO_GF128MUL is not set CONFIG_CRYPTO_NULL=m CONFIG_CRYPTO_WORKQUEUE=y # CONFIG_CRYPTO_CRYPTD is not set # CONFIG_CRYPTO_AUTHENC is not set # CONFIG_CRYPTO_TEST is not set # # Authenticated Encryption with Associated Data # # CONFIG_CRYPTO_CCM is not set # CONFIG_CRYPTO_GCM is not set # CONFIG_CRYPTO_SEQIV is not set # # Block modes # CONFIG_CRYPTO_CBC=m # CONFIG_CRYPTO_CTR is not set # CONFIG_CRYPTO_CTS is not set # CONFIG_CRYPTO_ECB is not set # CONFIG_CRYPTO_LRW is not set # CONFIG_CRYPTO_PCBC is not set # CONFIG_CRYPTO_XTS is not set # # Hash modes # # CONFIG_CRYPTO_HMAC is not set # CONFIG_CRYPTO_XCBC is not set # # Digest # CONFIG_CRYPTO_CRC32C=y # CONFIG_CRYPTO_CRC32C_INTEL is not set CONFIG_CRYPTO_MD4=m CONFIG_CRYPTO_MD5=y # CONFIG_CRYPTO_MICHAEL_MIC is not set # CONFIG_CRYPTO_RMD128 is not set # CONFIG_CRYPTO_RMD160 is not set # 
CONFIG_CRYPTO_RMD256 is not set # CONFIG_CRYPTO_RMD320 is not set CONFIG_CRYPTO_SHA1=y CONFIG_CRYPTO_SHA256=m # CONFIG_CRYPTO_SHA512 is not set # CONFIG_CRYPTO_TGR192 is not set # CONFIG_CRYPTO_WP512 is not set # # Ciphers # CONFIG_CRYPTO_AES=m # CONFIG_CRYPTO_AES_586 is not set # CONFIG_CRYPTO_ANUBIS is not set # CONFIG_CRYPTO_ARC4 is not set # CONFIG_CRYPTO_BLOWFISH is not set # CONFIG_CRYPTO_CAMELLIA is not set # CONFIG_CRYPTO_CAST5 is not set # CONFIG_CRYPTO_CAST6 is not set CONFIG_CRYPTO_DES=m # CONFIG_CRYPTO_FCRYPT is not set # CONFIG_CRYPTO_KHAZAD is not set # CONFIG_CRYPTO_SALSA20 is not set # CONFIG_CRYPTO_SALSA20_586 is not set # CONFIG_CRYPTO_SEED is not set # CONFIG_CRYPTO_SERPENT is not set # CONFIG_CRYPTO_TEA is not set # CONFIG_CRYPTO_TWOFISH is not set # CONFIG_CRYPTO_TWOFISH_586 is not set # # Compression # # CONFIG_CRYPTO_DEFLATE is not set # CONFIG_CRYPTO_ZLIB is not set # CONFIG_CRYPTO_LZO is not set # # Random Number Generation # # CONFIG_CRYPTO_ANSI_CPRNG is not set # CONFIG_CRYPTO_HW is not set CONFIG_HAVE_KVM=y CONFIG_HAVE_KVM_IRQCHIP=y # CONFIG_VIRTUALIZATION is not set CONFIG_BINARY_PRINTF=y # # Library routines # CONFIG_BITREVERSE=y CONFIG_GENERIC_FIND_FIRST_BIT=y CONFIG_GENERIC_FIND_NEXT_BIT=y CONFIG_GENERIC_FIND_LAST_BIT=y CONFIG_CRC_CCITT=m CONFIG_CRC16=m # CONFIG_CRC_T10DIF is not set CONFIG_CRC_ITU_T=y CONFIG_CRC32=y # CONFIG_CRC7 is not set # CONFIG_LIBCRC32C is not set CONFIG_ZLIB_INFLATE=y CONFIG_ZLIB_DEFLATE=m CONFIG_DECOMPRESS_GZIP=y CONFIG_DECOMPRESS_BZIP2=y CONFIG_DECOMPRESS_LZMA=y CONFIG_HAS_IOMEM=y CONFIG_HAS_IOPORT=y CONFIG_HAS_DMA=y CONFIG_NLATTR=y

[-- Attachment #3: dmesg.txt --]
[-- Type: text/plain, Size: 90566 bytes --]

Initializing cgroup subsys cpuset Initializing cgroup subsys cpu Linux version 2.6.30-rc4-io (root-bi+AKbBUZKY6gyzm1THtWbp2dZbC/Bob@public.gmane.org) (gcc version 4.1.2 20070925 (Red Hat 4.1.2-33)) #6 SMP PREEMPT Thu May 7 11:07:49 CST 2009 KERNEL supported cpus: Intel GenuineIntel AMD AuthenticAMD NSC
Geode by NSC Cyrix CyrixInstead Centaur CentaurHauls Transmeta GenuineTMx86 Transmeta TransmetaCPU UMC UMC UMC UMC BIOS-provided physical RAM map: BIOS-e820: 0000000000000000 - 000000000009f400 (usable) BIOS-e820: 000000000009f400 - 00000000000a0000 (reserved) BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved) BIOS-e820: 0000000000100000 - 000000003bff0000 (usable) BIOS-e820: 000000003bff0000 - 000000003bff3000 (ACPI NVS) BIOS-e820: 000000003bff3000 - 000000003c000000 (ACPI data) BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved) DMI 2.3 present. Phoenix BIOS detected: BIOS may corrupt low RAM, working around it. e820 update range: 0000000000000000 - 0000000000010000 (usable) ==> (reserved) last_pfn = 0x3bff0 max_arch_pfn = 0x100000 MTRR default type: uncachable MTRR fixed ranges enabled: 00000-9FFFF write-back A0000-BFFFF uncachable C0000-C7FFF write-protect C8000-FFFFF uncachable MTRR variable ranges enabled: 0 base 000000000 mask FC0000000 write-back 1 base 03C000000 mask FFC000000 uncachable 2 base 0D0000000 mask FF8000000 write-combining 3 disabled 4 disabled 5 disabled 6 disabled 7 disabled init_memory_mapping: 0000000000000000-00000000377fe000 0000000000 - 0000400000 page 4k 0000400000 - 0037400000 page 2M 0037400000 - 00377fe000 page 4k kernel direct mapping tables up to 377fe000 @ 10000-15000 RAMDISK: 37d0d000 - 37fefd69 Allocated new RAMDISK: 00100000 - 003e2d69 Move RAMDISK from 0000000037d0d000 - 0000000037fefd68 to 00100000 - 003e2d68 ACPI: RSDP 000f7560 00014 (v00 AWARD ) ACPI: RSDT 3bff3040 0002C (v01 AWARD AWRDACPI 42302E31 AWRD 00000000) ACPI: FACP 3bff30c0 00074 (v01 AWARD AWRDACPI 42302E31 AWRD 00000000) ACPI: DSDT 3bff3180 03ABC (v01 AWARD AWRDACPI 00001000 MSFT 0100000E) ACPI: FACS 3bff0000 00040 ACPI: APIC 3bff6c80 00084 (v01 AWARD AWRDACPI 42302E31 AWRD 00000000) ACPI: Local APIC address 0xfee00000 71MB HIGHMEM available. 887MB LOWMEM available. 
mapped low ram: 0 - 377fe000
low ram: 0 - 377fe000
node 0 low ram: 00000000 - 377fe000
node 0 bootmap 00011000 - 00017f00
(9 early reservations) ==> bootmem [0000000000 - 00377fe000]
  #0 [0000000000 - 0000001000]   BIOS data page ==> [0000000000 - 0000001000]
  #1 [0000001000 - 0000002000]    EX TRAMPOLINE ==> [0000001000 - 0000002000]
  #2 [0000006000 - 0000007000]       TRAMPOLINE ==> [0000006000 - 0000007000]
  #3 [0000400000 - 0000c6bd1c]    TEXT DATA BSS ==> [0000400000 - 0000c6bd1c]
  #4 [000009f400 - 0000100000]    BIOS reserved ==> [000009f400 - 0000100000]
  #5 [0000c6c000 - 0000c700ed]              BRK ==> [0000c6c000 - 0000c700ed]
  #6 [0000010000 - 0000011000]          PGTABLE ==> [0000010000 - 0000011000]
  #7 [0000100000 - 00003e2d69]      NEW RAMDISK ==> [0000100000 - 00003e2d69]
  #8 [0000011000 - 0000018000]          BOOTMAP ==> [0000011000 - 0000018000]
found SMP MP-table at [c00f5ad0] f5ad0
Zone PFN ranges:
  DMA      0x00000010 -> 0x00001000
  Normal   0x00001000 -> 0x000377fe
  HighMem  0x000377fe -> 0x0003bff0
Movable zone start PFN for each node
early_node_map[2] active PFN ranges
    0: 0x00000010 -> 0x0000009f
    0: 0x00000100 -> 0x0003bff0
On node 0 totalpages: 245631
free_area_init_node: node 0, pgdat c0778f80, node_mem_map c1000340
  DMA zone: 52 pages used for memmap
  DMA zone: 0 pages reserved
  DMA zone: 3931 pages, LIFO batch:0
  Normal zone: 2834 pages used for memmap
  Normal zone: 220396 pages, LIFO batch:31
  HighMem zone: 234 pages used for memmap
  HighMem zone: 18184 pages, LIFO batch:3
Using APIC driver default
ACPI: PM-Timer IO Port: 0x1008
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x01] enabled)
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x02] disabled)
ACPI: LAPIC (acpi_id[0x03] lapic_id[0x03] disabled)
ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x02] high edge lint[0x1])
ACPI: LAPIC_NMI (acpi_id[0x03] high edge lint[0x1])
ACPI: IOAPIC (id[0x04] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 4, version 17, address 0xfec00000, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 dfl dfl)
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.
ACPI: IRQ9 used by override.
Enabling APIC mode:  Flat.  Using 1 I/O APICs
Using ACPI (MADT) for SMP configuration information
SMP: Allowing 4 CPUs, 2 hotplug CPUs
nr_irqs_gsi: 24
Allocating PCI resources starting at 40000000 (gap: 3c000000:c2c00000)
NR_CPUS:8 nr_cpumask_bits:8 nr_cpu_ids:4 nr_node_ids:1
PERCPU: Embedded 13 pages at c1c3b000, static data 32756 bytes
Built 1 zonelists in Zone order, mobility grouping on.  Total pages: 242511
Kernel command line: ro root=LABEL=/ rhgb quiet
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Initializing CPU#0
Preemptible RCU implementation.
NR_IRQS:512
CPU 0 irqstacks, hard=c1c3b000 soft=c1c3c000
PID hash table entries: 4096 (order: 12, 16384 bytes)
Fast TSC calibration using PIT
Detected 2800.222 MHz processor.
Console: colour VGA+ 80x25
console [tty0] enabled
Lock dependency validator: Copyright (c) 2006 Red Hat, Inc., Ingo Molnar
... MAX_LOCKDEP_SUBCLASSES:  8
... MAX_LOCK_DEPTH:          48
... MAX_LOCKDEP_KEYS:        8191
... CLASSHASH_SIZE:          4096
... MAX_LOCKDEP_ENTRIES:     8192
... MAX_LOCKDEP_CHAINS:      16384
... CHAINHASH_SIZE:          8192
 memory used by lock dependency info: 2847 kB
 per task-struct memory footprint: 1152 bytes
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
allocated 4914560 bytes of page_cgroup
please try cgroup_disable=memory,blkio option if you don't want
Initializing HighMem for node 0 (000377fe:0003bff0)
Memory: 952284k/982976k available (2258k kernel code, 30016k reserved, 1424k data, 320k init, 73672k highmem)
virtual kernel memory layout:
    fixmap  : 0xffedf000 - 0xfffff000   (1152 kB)
    pkmap   : 0xff800000 - 0xffc00000   (4096 kB)
    vmalloc : 0xf7ffe000 - 0xff7fe000   ( 120 MB)
    lowmem  : 0xc0000000 - 0xf77fe000   ( 887 MB)
      .init : 0xc079d000 - 0xc07ed000   ( 320 kB)
      .data : 0xc06349ab - 0xc0798cb8   (1424 kB)
      .text : 0xc0400000 - 0xc06349ab   (2258 kB)
Checking if this processor honours the WP bit even in supervisor mode...Ok.
SLUB: Genslabs=13, HWalign=128, Order=0-3, MinObjects=0, CPUs=4, Nodes=1
Calibrating delay loop (skipped), value calculated using timer frequency.. 5600.44 BogoMIPS (lpj=2800222)
Mount-cache hash table entries: 512
Initializing cgroup subsys debug
Initializing cgroup subsys ns
Initializing cgroup subsys cpuacct
Initializing cgroup subsys memory
Initializing cgroup subsys blkio
Initializing cgroup subsys devices
Initializing cgroup subsys freezer
Initializing cgroup subsys net_cls
Initializing cgroup subsys io
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 0
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
CPU0: Intel P4/Xeon Extended MCE MSRs (24) available
using mwait in idle threads.
Checking 'hlt' instruction... OK.
ACPI: Core revision 20090320
ftrace: converting mcount calls to 0f 1f 44 00 00
ftrace: allocating 12136 entries in 24 pages
..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
CPU0: Intel(R) Pentium(R) D CPU 2.80GHz stepping 04
lockdep: fixing up alternatives.
CPU 1 irqstacks, hard=c1c4b000 soft=c1c4c000
Booting processor 1 APIC 0x1 ip 0x6000
Initializing CPU#1
Calibrating delay using timer specific routine.. 5599.23 BogoMIPS (lpj=2799617)
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: Physical Processor ID: 0
CPU: Processor Core ID: 1
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#1.
CPU1: Intel P4/Xeon Extended MCE MSRs (24) available
CPU1: Intel(R) Pentium(R) D CPU 2.80GHz stepping 04
checking TSC synchronization [CPU#0 -> CPU#1]: passed.
Brought up 2 CPUs
Total of 2 processors activated (11199.67 BogoMIPS).
CPU0 attaching sched-domain:
 domain 0: span 0-1 level CPU
  groups: 0 1
CPU1 attaching sched-domain:
 domain 0: span 0-1 level CPU
  groups: 1 0
net_namespace: 436 bytes
NET: Registered protocol family 16
ACPI: bus type pci registered
PCI: PCI BIOS revision 2.10 entry at 0xfbda0, last bus=1
PCI: Using configuration type 1 for base access
mtrr: your CPUs had inconsistent fixed MTRR settings
mtrr: probably your BIOS does not setup all CPUs.
mtrr: corrected configuration.
bio: create slab <bio-0> at 0
ACPI: EC: Look up EC in DSDT
ACPI: Interpreter enabled
ACPI: (supports S0 S3 S5)
ACPI: Using IOAPIC for interrupt routing
ACPI: No dock devices found.
ACPI: PCI Root Bridge [PCI0] (0000:00)
pci 0000:00:00.0: reg 10 32bit mmio: [0xd0000000-0xd7ffffff]
pci 0000:00:02.5: reg 10 io port: [0x1f0-0x1f7]
pci 0000:00:02.5: reg 14 io port: [0x3f4-0x3f7]
pci 0000:00:02.5: reg 18 io port: [0x170-0x177]
pci 0000:00:02.5: reg 1c io port: [0x374-0x377]
pci 0000:00:02.5: reg 20 io port: [0x4000-0x400f]
pci 0000:00:02.5: PME# supported from D3cold
pci 0000:00:02.5: PME# disabled
pci 0000:00:02.7: reg 10 io port: [0xd000-0xd0ff]
pci 0000:00:02.7: reg 14 io port: [0xd400-0xd47f]
pci 0000:00:02.7: supports D1 D2
pci 0000:00:02.7: PME# supported from D3hot D3cold
pci 0000:00:02.7: PME# disabled
pci 0000:00:03.0: reg 10 32bit mmio: [0xe1104000-0xe1104fff]
pci 0000:00:03.1: reg 10 32bit mmio: [0xe1100000-0xe1100fff]
pci 0000:00:03.2: reg 10 32bit mmio: [0xe1101000-0xe1101fff]
pci 0000:00:03.3: reg 10 32bit mmio: [0xe1102000-0xe1102fff]
pci 0000:00:03.3: PME# supported from D0 D3hot D3cold
pci 0000:00:03.3: PME# disabled
pci 0000:00:05.0: reg 10 io port: [0xd800-0xd807]
pci 0000:00:05.0: reg 14 io port: [0xdc00-0xdc03]
pci 0000:00:05.0: reg 18 io port: [0xe000-0xe007]
pci 0000:00:05.0: reg 1c io port: [0xe400-0xe403]
pci 0000:00:05.0: reg 20 io port: [0xe800-0xe80f]
pci 0000:00:05.0: PME# supported from D3cold
pci 0000:00:05.0: PME# disabled
pci 0000:00:0e.0: reg 10 io port: [0xec00-0xecff]
pci 0000:00:0e.0: reg 14 32bit mmio: [0xe1103000-0xe11030ff]
pci 0000:00:0e.0: reg 30 32bit mmio: [0x000000-0x01ffff]
pci 0000:00:0e.0: supports D1 D2
pci 0000:00:0e.0: PME# supported from D1 D2 D3hot D3cold
pci 0000:00:0e.0: PME# disabled
pci 0000:01:00.0: reg 10 32bit mmio: [0xd8000000-0xdfffffff]
pci 0000:01:00.0: reg 14 32bit mmio: [0xe1000000-0xe101ffff]
pci 0000:01:00.0: reg 18 io port: [0xc000-0xc07f]
pci 0000:01:00.0: supports D1 D2
pci 0000:00:01.0: bridge io port: [0xc000-0xcfff]
pci 0000:00:01.0: bridge 32bit mmio: [0xe1000000-0xe10fffff]
pci 0000:00:01.0: bridge 32bit mmio pref: [0xd8000000-0xdfffffff]
pci_bus 0000:00: on NUMA node 0
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 9 10 11 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 9 10 *11 14 15)
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 5 6 7 9 *10 11 14 15)
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 9 10 11 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 6 7 9 10 *11 14 15)
ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 *6 7 9 10 11 14 15)
ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 6 7 *9 10 11 14 15)
ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 *5 6 7 9 10 11 14 15)
usbcore: registered new interface driver usbfs
usbcore: registered new interface driver hub
usbcore: registered new device driver usb
PCI: Using ACPI for IRQ routing
pnp: PnP ACPI init
ACPI: bus type pnp registered
pnp: PnP ACPI: found 12 devices
ACPI: ACPI bus type pnp unregistered
system 00:00: iomem range 0xc8000-0xcbfff has been reserved
system 00:00: iomem range 0xf0000-0xf7fff could not be reserved
system 00:00: iomem range 0xf8000-0xfbfff could not be reserved
system 00:00: iomem range 0xfc000-0xfffff could not be reserved
system 00:00: iomem range 0x3bff0000-0x3bffffff could not be reserved
system 00:00: iomem range 0xffff0000-0xffffffff has been reserved
system 00:00: iomem range 0x0-0x9ffff could not be reserved
system 00:00: iomem range 0x100000-0x3bfeffff could not be reserved
system 00:00: iomem range 0xffee0000-0xffefffff has been reserved
system 00:00: iomem range 0xfffe0000-0xfffeffff has been reserved
system 00:00: iomem range 0xfec00000-0xfecfffff has been reserved
system 00:00: iomem range 0xfee00000-0xfeefffff has been reserved
system 00:02: ioport range 0x4d0-0x4d1 has been reserved
system 00:02: ioport range 0x800-0x805 has been reserved
system 00:02: ioport range 0x290-0x297 has been reserved
system 00:02: ioport range 0x880-0x88f has been reserved
pci 0000:00:01.0: PCI bridge, secondary bus 0000:01
pci 0000:00:01.0:   IO window: 0xc000-0xcfff
pci 0000:00:01.0:   MEM window: 0xe1000000-0xe10fffff
pci 0000:00:01.0:   PREFETCH window: 0x000000d8000000-0x000000dfffffff
pci_bus 0000:00: resource 0 io: [0x00-0xffff]
pci_bus 0000:00: resource 1 mem: [0x000000-0xffffffff]
pci_bus 0000:01: resource 0 io: [0xc000-0xcfff]
pci_bus 0000:01: resource 1 mem: [0xe1000000-0xe10fffff]
pci_bus 0000:01: resource 2 pref mem [0xd8000000-0xdfffffff]
NET: Registered protocol family 2
IP route cache hash table entries: 32768 (order: 5, 131072 bytes)
TCP established hash table entries: 131072 (order: 8, 1048576 bytes)
TCP bind hash table entries: 65536 (order: 9, 2097152 bytes)
TCP: Hash tables configured (established 131072 bind 65536)
TCP reno registered
NET: Registered protocol family 1
checking if image is initramfs... rootfs image is initramfs; unpacking...
Freeing initrd memory: 2955k freed
apm: BIOS version 1.2 Flags 0x07 (Driver version 1.16ac)
apm: disabled - APM is not SMP safe.
highmem bounce pool size: 64 pages
HugeTLB registered 4 MB page size, pre-allocated 0 pages
msgmni has been set to 1722
alg: No test for stdrng (krng)
Block layer SCSI generic (bsg) driver version 0.4 loaded (major 254)
io scheduler noop registered
io scheduler cfq registered (default)
pci 0000:01:00.0: Boot video device
pci_hotplug: PCI Hot Plug PCI Core version: 0.5
fan PNP0C0B:00: registered as cooling_device0
ACPI: Fan [FAN] (on)
processor ACPI_CPU:00: registered as cooling_device1
processor ACPI_CPU:01: registered as cooling_device2
thermal LNXTHERM:01: registered as thermal_zone0
ACPI: Thermal Zone [THRM] (62 C)
isapnp: Scanning for PnP cards...
Switched to high resolution mode on CPU 1
Switched to high resolution mode on CPU 0
isapnp: No Plug & Play device found
Real Time Clock Driver v1.12b
Non-volatile memory driver v1.3
Linux agpgart interface v0.103
agpgart-sis 0000:00:00.0: SiS chipset [1039/0661]
agpgart-sis 0000:00:00.0: AGP aperture is 128M @ 0xd0000000
Serial: 8250/16550 driver, 4 ports, IRQ sharing enabled
serial8250: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
serial8250: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
00:07: ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
00:08: ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
brd: module loaded
PNP: PS/2 Controller [PNP0303:PS2K,PNP0f13:PS2M] at 0x60,0x64 irq 1,12
serio: i8042 KBD port at 0x60,0x64 irq 1
serio: i8042 AUX port at 0x60,0x64 irq 12
mice: PS/2 mouse device common for all mice
cpuidle: using governor ladder
cpuidle: using governor menu
TCP cubic registered
NET: Registered protocol family 17
Using IPI No-Shortcut mode
registered taskstats version 1
Freeing unused kernel memory: 320k freed
Write protecting the kernel text: 2260k
Write protecting the kernel read-only data: 1120k
ehci_hcd: USB 2.0 'Enhanced' Host Controller (EHCI) Driver
ehci_hcd 0000:00:03.3: PCI INT D -> GSI 23 (level, low) -> IRQ 23
ehci_hcd 0000:00:03.3: EHCI Host Controller
ehci_hcd 0000:00:03.3: new USB bus registered, assigned bus number 1
ehci_hcd 0000:00:03.3: cache line size of 128 is not supported
ehci_hcd 0000:00:03.3: irq 23, io mem 0xe1102000
ehci_hcd 0000:00:03.3: USB 2.0 started, EHCI 1.00
usb usb1: configuration #1 chosen from 1 choice
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 8 ports detected
ohci_hcd: USB 1.1 'Open' Host Controller (OHCI) Driver
ohci_hcd 0000:00:03.0: PCI INT A -> GSI 20 (level, low) -> IRQ 20
ohci_hcd 0000:00:03.0: OHCI Host Controller
ohci_hcd 0000:00:03.0: new USB bus registered, assigned bus number 2
ohci_hcd 0000:00:03.0: irq 20, io mem 0xe1104000
usb usb2: configuration #1 chosen from 1 choice
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 3 ports detected
ohci_hcd 0000:00:03.1: PCI INT B -> GSI 21 (level, low) -> IRQ 21
ohci_hcd 0000:00:03.1: OHCI Host Controller
ohci_hcd 0000:00:03.1: new USB bus registered, assigned bus number 3
ohci_hcd 0000:00:03.1: irq 21, io mem 0xe1100000
usb usb3: configuration #1 chosen from 1 choice
hub 3-0:1.0: USB hub found
hub 3-0:1.0: 3 ports detected
ohci_hcd 0000:00:03.2: PCI INT C -> GSI 22 (level, low) -> IRQ 22
ohci_hcd 0000:00:03.2: OHCI Host Controller
ohci_hcd 0000:00:03.2: new USB bus registered, assigned bus number 4
ohci_hcd 0000:00:03.2: irq 22, io mem 0xe1101000
usb usb4: configuration #1 chosen from 1 choice
hub 4-0:1.0: USB hub found
hub 4-0:1.0: 2 ports detected
uhci_hcd: USB Universal Host Controller Interface driver
SCSI subsystem initialized
Driver 'sd' needs updating - please use bus_type methods
libata version 3.00 loaded.
pata_sis 0000:00:02.5: version 0.5.2
pata_sis 0000:00:02.5: PCI INT A -> GSI 16 (level, low) -> IRQ 16
scsi0 : pata_sis
scsi1 : pata_sis
ata1: PATA max UDMA/133 cmd 0x1f0 ctl 0x3f6 bmdma 0x4000 irq 14
ata2: PATA max UDMA/133 cmd 0x170 ctl 0x376 bmdma 0x4008 irq 15
input: ImPS/2 Logitech Wheel Mouse as /class/input/input0
input: AT Translated Set 2 keyboard as /class/input/input1
sata_sis 0000:00:05.0: version 1.0
sata_sis 0000:00:05.0: PCI INT A -> GSI 17 (level, low) -> IRQ 17
sata_sis 0000:00:05.0: Detected SiS 180/181/964 chipset in SATA mode
scsi2 : sata_sis
scsi3 : sata_sis
ata3: SATA max UDMA/133 cmd 0xd800 ctl 0xdc00 bmdma 0xe800 irq 17
ata4: SATA max UDMA/133 cmd 0xe000 ctl 0xe400 bmdma 0xe808 irq 17
ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
ata3.00: ATA-7: ST3808110AS, 3.AAE, max UDMA/133
ata3.00: 156301488 sectors, multi 16: LBA48 NCQ (depth 0/32)
ata3.00: configured for UDMA/133
scsi 2:0:0:0: Direct-Access     ATA      ST3808110AS      3.AA PQ: 0 ANSI: 5
sd 2:0:0:0: [sda] 156301488 512-byte hardware sectors: (80.0 GB/74.5 GiB)
sd 2:0:0:0: [sda] Write Protect is off
sd 2:0:0:0: [sda] Mode Sense: 00 3a 00 00
sd 2:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
 sda: sda1 sda2 < sda5 sda6 sda7 sda8 sda9 >
sd 2:0:0:0: [sda] Attached SCSI disk
ata4: SATA link down (SStatus 0 SControl 300)
EXT3-fs: INFO: recovery required on readonly filesystem.
EXT3-fs: write access will be enabled during recovery.
kjournald starting.  Commit interval 5 seconds
EXT3-fs: sda8: orphan cleanup on readonly fs
ext3_orphan_cleanup: deleting unreferenced inode 3725366
ext3_orphan_cleanup: deleting unreferenced inode 3725365
ext3_orphan_cleanup: deleting unreferenced inode 3725364
EXT3-fs: sda8: 3 orphan inodes deleted
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with writeback data mode.
r8169 Gigabit Ethernet driver 2.3LK-NAPI loaded
r8169 0000:00:0e.0: PCI INT A -> GSI 18 (level, low) -> IRQ 18
r8169 0000:00:0e.0: no PCI Express capability
eth0: RTL8110s at 0xf8236000, 00:16:ec:2e:b7:e0, XID 04000000 IRQ 18
sd 2:0:0:0: Attached scsi generic sg0 type 0
parport_pc 00:09: reported by Plug and Play ACPI
parport0: PC-style at 0x378 (0x778), irq 7 [PCSPP,TRISTATE]
input: Power Button as /class/input/input2
ACPI: Power Button [PWRF]
input: Power Button as /class/input/input3
ACPI: Power Button [PWRB]
input: Sleep Button as /class/input/input4
ACPI: Sleep Button [FUTS]
ramfs: bad mount option: maxsize=512
EXT3 FS on sda8, internal journal
kjournald starting.  Commit interval 5 seconds
EXT3 FS on sda7, internal journal
EXT3-fs: mounted filesystem with writeback data mode.
Adding 1052216k swap on /dev/sda6.  Priority:-1 extents:1 across:1052216k
warning: process `kudzu' used the deprecated sysctl system call with 1.23.
kudzu[1133] general protection ip:8056968 sp:bffe9e90 error:0
r8169: eth0: link up
r8169: eth0: link up
warning: `dbus-daemon' uses 32-bit capabilities (legacy support in use)
CPU0 attaching NULL sched-domain.
CPU1 attaching NULL sched-domain.
CPU0 attaching sched-domain:
 domain 0: span 0-1 level CPU
  groups: 0 1
CPU1 attaching sched-domain:
 domain 0: span 0-1 level CPU
  groups: 1 0

=========================================================
[ INFO: possible irq lock inversion dependency detected ]
2.6.30-rc4-io #6
---------------------------------------------------------
rmdir/2186 just changed the state of lock:
 (&iocg->lock){+.+...}, at: [<c0513b18>] iocg_destroy+0x2a/0x118
but this lock was taken by another, SOFTIRQ-safe lock in the past:
 (&q->__queue_lock){..-...}

and interrupts could create inverse lock ordering between them.

other info that might help us debug this:
3 locks held by rmdir/2186:
 #0:  (&sb->s_type->i_mutex_key#10/1){+.+.+.}, at: [<c04ae1e8>] do_rmdir+0x5c/0xc8
 #1:  (cgroup_mutex){+.+.+.}, at: [<c045a15b>] cgroup_diput+0x3c/0xa7
 #2:  (&iocg->lock){+.+...}, at: [<c0513b18>] iocg_destroy+0x2a/0x118

the first lock's dependencies:
-> (&iocg->lock){+.+...} ops: 3 {
   HARDIRQ-ON-W at:
     [<c044b840>] mark_held_locks+0x3d/0x58 [<c044b963>] trace_hardirqs_on_caller+0x108/0x14c [<c044b9b2>] trace_hardirqs_on+0xb/0xd [<c0630883>] _spin_unlock_irq+0x27/0x47 [<c0513baa>] iocg_destroy+0xbc/0x118 [<c045a16a>] cgroup_diput+0x4b/0xa7 [<c04b1dbb>] dentry_iput+0x78/0x9c [<c04b1e82>] d_kill+0x21/0x3b [<c04b2f2a>] dput+0xf3/0xfc [<c04ae226>] do_rmdir+0x9a/0xc8 [<c04ae29d>] sys_rmdir+0x15/0x17 [<c0402a68>] sysenter_do_call+0x12/0x36 [<ffffffff>] 0xffffffff
   SOFTIRQ-ON-W at:
     [<c044b840>] mark_held_locks+0x3d/0x58 [<c044b97c>] trace_hardirqs_on_caller+0x121/0x14c [<c044b9b2>] trace_hardirqs_on+0xb/0xd [<c0630883>] _spin_unlock_irq+0x27/0x47 [<c0513baa>] iocg_destroy+0xbc/0x118 [<c045a16a>] cgroup_diput+0x4b/0xa7 [<c04b1dbb>] dentry_iput+0x78/0x9c [<c04b1e82>] d_kill+0x21/0x3b [<c04b2f2a>] dput+0xf3/0xfc [<c04ae226>] do_rmdir+0x9a/0xc8 [<c04ae29d>] sys_rmdir+0x15/0x17 [<c0402a68>] sysenter_do_call+0x12/0x36 [<ffffffff>] 0xffffffff
   INITIAL USE at:
     [<c044dad5>] __lock_acquire+0x58c/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c06304ea>] _spin_lock_irq+0x30/0x3f [<c05119bd>] io_alloc_root_group+0x104/0x155 [<c05133cb>] elv_init_fq_data+0x32/0xe0 [<c0504317>] elevator_alloc+0x150/0x170 [<c0505393>] elevator_init+0x9d/0x100 [<c0507088>] blk_init_queue_node+0xc4/0xf7 [<c05070cb>] blk_init_queue+0x10/0x12 [<f81060fd>] __scsi_alloc_queue+0x1c/0xba [scsi_mod] [<f81061b0>] scsi_alloc_queue+0x15/0x4e [scsi_mod] [<f810803d>] scsi_alloc_sdev+0x154/0x1f5 [scsi_mod] [<f8108387>] scsi_probe_and_add_lun+0x123/0xb5b [scsi_mod] [<f8109847>] __scsi_add_device+0x8a/0xb0 [scsi_mod] [<f816ad14>] ata_scsi_scan_host+0x77/0x141 [libata] [<f816903f>] async_port_probe+0xa0/0xa9 [libata] [<c044341f>] async_thread+0xe9/0x1c9 [<c043e204>] kthread+0x4a/0x72 [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff
 }
 ... key at: [<c0c5ebd8>] __key.29462+0x0/0x8

the second lock's dependencies:
-> (&q->__queue_lock){..-...} ops: 162810 {
   IN-SOFTIRQ-W at:
     [<c044da08>] __lock_acquire+0x4bf/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 [<f810672c>] scsi_device_unbusy+0x78/0x92 [scsi_mod] [<f8101483>] scsi_finish_command+0x22/0xd4 [scsi_mod] [<f8106fdb>] scsi_softirq_done+0xf9/0x101 [scsi_mod] [<c050a936>] blk_done_softirq+0x5e/0x70 [<c0431379>] __do_softirq+0xb8/0x180 [<ffffffff>] 0xffffffff
   INITIAL USE at:
     [<c044dad5>] __lock_acquire+0x58c/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<f8101337>] scsi_adjust_queue_depth+0x2a/0xc9 [scsi_mod] [<f8108079>] scsi_alloc_sdev+0x190/0x1f5 [scsi_mod] [<f8108387>] scsi_probe_and_add_lun+0x123/0xb5b [scsi_mod] [<f8109847>] __scsi_add_device+0x8a/0xb0 [scsi_mod] [<f816ad14>] ata_scsi_scan_host+0x77/0x141 [libata] [<f816903f>] async_port_probe+0xa0/0xa9 [libata] [<c044341f>] async_thread+0xe9/0x1c9 [<c043e204>] kthread+0x4a/0x72 [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff
 }
 ... key at: [<c0c5e698>] __key.29749+0x0/0x8
 -> (&ioc->lock){..-...} ops: 1032 {
    IN-SOFTIRQ-W at:
      [<c044da08>] __lock_acquire+0x4bf/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c050f0f0>] cic_free_func+0x26/0x64 [<c050ea90>] __call_for_each_cic+0x23/0x2e [<c050eaad>] cfq_free_io_context+0x12/0x14 [<c050978c>] put_io_context+0x4b/0x66 [<c050f2a2>] cfq_put_request+0x42/0x5b [<c0504629>] elv_put_request+0x30/0x33 [<c050678d>] __blk_put_request+0x8b/0xb8 [<c0506953>] end_that_request_last+0x199/0x1a1 [<c0506a0d>] blk_end_io+0x51/0x6f [<c0506a64>] blk_end_request+0x11/0x13 [<f8106c9c>] scsi_io_completion+0x1d9/0x41f [scsi_mod] [<f810152d>] scsi_finish_command+0xcc/0xd4 [scsi_mod] [<f8106fdb>] scsi_softirq_done+0xf9/0x101 [scsi_mod] [<c050a936>] blk_done_softirq+0x5e/0x70 [<c0431379>] __do_softirq+0xb8/0x180 [<ffffffff>] 0xffffffff
    INITIAL USE at:
      [<c044dad5>] __lock_acquire+0x58c/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c050f9bf>] cfq_set_request+0x123/0x33d [<c05052e6>] elv_set_request+0x43/0x53 [<c0506d44>] get_request+0x22e/0x33f [<c0507498>] get_request_wait+0x137/0x15d [<c0507501>] blk_get_request+0x43/0x73 [<f8106854>] scsi_execute+0x24/0x11c [scsi_mod] [<f81069ff>] scsi_execute_req+0xb3/0x104 [scsi_mod] [<f81084f8>] scsi_probe_and_add_lun+0x294/0xb5b [scsi_mod] [<f8109847>] __scsi_add_device+0x8a/0xb0 [scsi_mod] [<f816ad14>] ata_scsi_scan_host+0x77/0x141 [libata] [<f816903f>] async_port_probe+0xa0/0xa9 [libata] [<c044341f>] async_thread+0xe9/0x1c9 [<c043e204>] kthread+0x4a/0x72 [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff
  }
  ... key at: [<c0c5e6ec>] __key.27747+0x0/0x8
 -> (&rdp->lock){-.-...} ops: 168014 {
    IN-HARDIRQ-W at:
      [<c044d9e4>] __lock_acquire+0x49b/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c0461b2a>] rcu_check_callbacks+0x6a/0xa3 [<c043549a>] update_process_times+0x3d/0x53 [<c0447fe0>] tick_periodic+0x6b/0x77 [<c0448009>] tick_handle_periodic+0x1d/0x60 [<c063406e>] smp_apic_timer_interrupt+0x6e/0x81 [<c04033c7>] apic_timer_interrupt+0x2f/0x34 [<c042fbd7>] do_exit+0x53e/0x5b3 [<c043a9d8>] __request_module+0x0/0x100 [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff
    IN-SOFTIRQ-W at:
      [<c044da08>] __lock_acquire+0x4bf/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 [<c04619db>] rcu_process_callbacks+0x2b/0x86 [<c0431379>] __do_softirq+0xb8/0x180 [<ffffffff>] 0xffffffff
    INITIAL USE at:
      [<c044dad5>] __lock_acquire+0x58c/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c062c8ca>] rcu_online_cpu+0x3d/0x51 [<c062c910>] rcu_cpu_notify+0x32/0x43 [<c07b097f>] __rcu_init+0xf0/0x120 [<c07af027>] rcu_init+0x8/0x14 [<c079d6e1>] start_kernel+0x187/0x2fc [<c079d06a>] __init_begin+0x6a/0x6f [<ffffffff>] 0xffffffff
  }
  ... key at: [<c0c2e52c>] __key.17543+0x0/0x8
  ... acquired at:
     [<c044d243>] validate_chain+0x8a8/0xbae [<c044dbfd>] __lock_acquire+0x6b4/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 [<c046143d>] call_rcu+0x36/0x5b [<c0517b45>] radix_tree_delete+0xe7/0x176 [<c050f0fe>] cic_free_func+0x34/0x64 [<c050ea90>] __call_for_each_cic+0x23/0x2e [<c050eaad>] cfq_free_io_context+0x12/0x14 [<c050978c>] put_io_context+0x4b/0x66 [<c050984c>] exit_io_context+0x77/0x7b [<c042fc24>] do_exit+0x58b/0x5b3 [<c04034ed>] kernel_thread_helper+0xd/0x10 [<ffffffff>] 0xffffffff
 ... acquired at:
    [<c044d243>] validate_chain+0x8a8/0xbae [<c044dbfd>] __lock_acquire+0x6b4/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c050f4a3>] cfq_cic_lookup+0xd9/0xef [<c050f674>] cfq_get_queue+0x92/0x2ba [<c050fb01>] cfq_set_request+0x265/0x33d [<c05052e6>] elv_set_request+0x43/0x53 [<c0506d44>] get_request+0x22e/0x33f [<c0507498>] get_request_wait+0x137/0x15d [<c0507501>] blk_get_request+0x43/0x73 [<f8106854>] scsi_execute+0x24/0x11c [scsi_mod] [<f81069ff>] scsi_execute_req+0xb3/0x104 [scsi_mod] [<f81084f8>] scsi_probe_and_add_lun+0x294/0xb5b [scsi_mod] [<f8109847>] __scsi_add_device+0x8a/0xb0 [scsi_mod] [<f816ad14>] ata_scsi_scan_host+0x77/0x141 [libata] [<f816903f>] async_port_probe+0xa0/0xa9 [libata] [<c044341f>] async_thread+0xe9/0x1c9 [<c043e204>] kthread+0x4a/0x72 [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff
 -> (&base->lock){..-...} ops: 348073 {
    IN-SOFTIRQ-W at:
      [<c044da08>] __lock_acquire+0x4bf/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c06304ea>] _spin_lock_irq+0x30/0x3f [<c0434b8b>] run_timer_softirq+0x3c/0x1d1 [<c0431379>] __do_softirq+0xb8/0x180 [<ffffffff>] 0xffffffff
    INITIAL USE at:
      [<c044dad5>] __lock_acquire+0x58c/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c0434e84>] lock_timer_base+0x24/0x43 [<c0434f3d>] mod_timer+0x46/0xcc [<c07bd97a>] con_init+0xa4/0x20e [<c07bd3b2>] console_init+0x12/0x20 [<c079d735>] start_kernel+0x1db/0x2fc [<c079d06a>] __init_begin+0x6a/0x6f [<ffffffff>] 0xffffffff
  }
  ... key at: [<c082304c>] __key.23401+0x0/0x8
  ... acquired at:
     [<c044d243>] validate_chain+0x8a8/0xbae [<c044dbfd>] __lock_acquire+0x6b4/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c0434e84>] lock_timer_base+0x24/0x43 [<c0434f3d>] mod_timer+0x46/0xcc [<c05075cb>] blk_plug_device+0x9a/0xdf [<c05049e1>] __elv_add_request+0x86/0x96 [<c0509d52>] blk_execute_rq_nowait+0x5d/0x86 [<c0509e2e>] blk_execute_rq+0xb3/0xd5 [<f81068f5>] scsi_execute+0xc5/0x11c [scsi_mod] [<f81069ff>] scsi_execute_req+0xb3/0x104 [scsi_mod] [<f81084f8>] scsi_probe_and_add_lun+0x294/0xb5b [scsi_mod] [<f8109847>] __scsi_add_device+0x8a/0xb0 [scsi_mod] [<f816ad14>] ata_scsi_scan_host+0x77/0x141 [libata] [<f816903f>] async_port_probe+0xa0/0xa9 [libata] [<c044341f>] async_thread+0xe9/0x1c9 [<c043e204>] kthread+0x4a/0x72 [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff
 -> (&sdev->list_lock){..-...} ops: 27612 {
    IN-SOFTIRQ-W at:
      [<c044da08>] __lock_acquire+0x4bf/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<f8101cb4>] scsi_put_command+0x17/0x57 [scsi_mod] [<f810620f>] scsi_next_command+0x26/0x39 [scsi_mod] [<f8106d02>] scsi_io_completion+0x23f/0x41f [scsi_mod] [<f810152d>] scsi_finish_command+0xcc/0xd4 [scsi_mod] [<f8106fdb>] scsi_softirq_done+0xf9/0x101 [scsi_mod] [<c050a936>] blk_done_softirq+0x5e/0x70 [<c0431379>] __do_softirq+0xb8/0x180 [<ffffffff>] 0xffffffff
    INITIAL USE at:
      [<c044dad5>] __lock_acquire+0x58c/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<f8101c64>] scsi_get_command+0x5c/0x95 [scsi_mod] [<f81062b6>] scsi_get_cmd_from_req+0x26/0x50 [scsi_mod] [<f8106594>] scsi_setup_blk_pc_cmnd+0x2b/0xd7 [scsi_mod] [<f8106664>] scsi_prep_fn+0x24/0x33 [scsi_mod] [<c0504712>] elv_next_request+0xe6/0x18d [<f810704c>] scsi_request_fn+0x69/0x431 [scsi_mod] [<c05072af>] __generic_unplug_device+0x2e/0x31 [<c0509d59>] blk_execute_rq_nowait+0x64/0x86 [<c0509e2e>] blk_execute_rq+0xb3/0xd5 [<f81068f5>] scsi_execute+0xc5/0x11c [scsi_mod] [<f81069ff>] scsi_execute_req+0xb3/0x104 [scsi_mod] [<f81084f8>] scsi_probe_and_add_lun+0x294/0xb5b [scsi_mod] [<f8109847>] __scsi_add_device+0x8a/0xb0 [scsi_mod] [<f816ad14>] ata_scsi_scan_host+0x77/0x141 [libata] [<f816903f>] async_port_probe+0xa0/0xa9 [libata] [<c044341f>] async_thread+0xe9/0x1c9 [<c043e204>] kthread+0x4a/0x72 [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff
  }
  ... key at: [<f811916c>] __key.29786+0x0/0xffff2ebf [scsi_mod]
  ... acquired at:
     [<c044d243>] validate_chain+0x8a8/0xbae [<c044dbfd>] __lock_acquire+0x6b4/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<f8101c64>] scsi_get_command+0x5c/0x95 [scsi_mod] [<f81062b6>] scsi_get_cmd_from_req+0x26/0x50 [scsi_mod] [<f8106594>] scsi_setup_blk_pc_cmnd+0x2b/0xd7 [scsi_mod] [<f8106664>] scsi_prep_fn+0x24/0x33 [scsi_mod] [<c0504712>] elv_next_request+0xe6/0x18d [<f810704c>] scsi_request_fn+0x69/0x431 [scsi_mod] [<c05072af>] __generic_unplug_device+0x2e/0x31 [<c0509d59>] blk_execute_rq_nowait+0x64/0x86 [<c0509e2e>] blk_execute_rq+0xb3/0xd5 [<f81068f5>] scsi_execute+0xc5/0x11c [scsi_mod] [<f81069ff>] scsi_execute_req+0xb3/0x104 [scsi_mod] [<f81084f8>] scsi_probe_and_add_lun+0x294/0xb5b [scsi_mod] [<f8109847>] __scsi_add_device+0x8a/0xb0 [scsi_mod] [<f816ad14>] ata_scsi_scan_host+0x77/0x141 [libata] [<f816903f>] async_port_probe+0xa0/0xa9 [libata] [<c044341f>] async_thread+0xe9/0x1c9 [<c043e204>] kthread+0x4a/0x72 [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff
 -> (&q->lock){-.-.-.} ops: 2105038 {
    IN-HARDIRQ-W at:
      [<c044d9e4>] __lock_acquire+0x49b/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c041ec0d>] complete+0x17/0x43 [<c062609b>] i8042_aux_test_irq+0x4c/0x65 [<c045e922>] handle_IRQ_event+0xa4/0x169 [<c04602ea>] handle_edge_irq+0xc9/0x10a [<ffffffff>] 0xffffffff
    IN-SOFTIRQ-W at:
      [<c044da08>] __lock_acquire+0x4bf/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c041ec0d>] complete+0x17/0x43 [<c043c336>] wakeme_after_rcu+0x10/0x12 [<c0461a12>] rcu_process_callbacks+0x62/0x86 [<c0431379>] __do_softirq+0xb8/0x180 [<ffffffff>] 0xffffffff
    IN-RECLAIM_FS-W at:
      [<c044dabd>] __lock_acquire+0x574/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c043e47b>] prepare_to_wait+0x1c/0x4a [<c0485d3e>] kswapd+0xa7/0x51b [<c043e204>] kthread+0x4a/0x72 [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff
    INITIAL USE at:
      [<c044dad5>] __lock_acquire+0x58c/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c06304ea>] _spin_lock_irq+0x30/0x3f [<c062d811>] wait_for_common+0x2f/0xeb [<c062d968>] wait_for_completion+0x17/0x19 [<c043e161>] kthread_create+0x6e/0xc7 [<c062b7eb>] migration_call+0x39/0x444 [<c07ae112>] migration_init+0x1d/0x4b [<c040115c>] do_one_initcall+0x6a/0x16e [<c079d44d>] kernel_init+0x4d/0x15a [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff
  }
  ... key at: [<c0823490>] __key.17681+0x0/0x8
 -> (&rq->lock){-.-.-.} ops: 854341 {
    IN-HARDIRQ-W at:
      [<c044d9e4>] __lock_acquire+0x49b/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 [<c0429f89>] scheduler_tick+0x39/0x19b [<c04354a4>] update_process_times+0x47/0x53 [<c0447fe0>] tick_periodic+0x6b/0x77 [<c0448009>] tick_handle_periodic+0x1d/0x60 [<c0404ace>] timer_interrupt+0x3e/0x45 [<c045e922>] handle_IRQ_event+0xa4/0x169 [<c04603a3>] handle_level_irq+0x78/0xc1 [<ffffffff>] 0xffffffff
    IN-SOFTIRQ-W at:
      [<c044da08>] __lock_acquire+0x4bf/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 [<c041ede7>] task_rq_lock+0x3b/0x62 [<c0426e41>] try_to_wake_up+0x75/0x2d4 [<c04270d7>] wake_up_process+0x14/0x16 [<c043507c>] process_timeout+0xd/0xf [<c0434caa>] run_timer_softirq+0x15b/0x1d1 [<c0431379>] __do_softirq+0xb8/0x180 [<ffffffff>] 0xffffffff
    IN-RECLAIM_FS-W at:
      [<c044dabd>] __lock_acquire+0x574/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 [<c041ede7>] task_rq_lock+0x3b/0x62 [<c0427515>] set_cpus_allowed_ptr+0x1a/0xdd [<c0485cf8>] kswapd+0x61/0x51b [<c043e204>] kthread+0x4a/0x72 [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff
    INITIAL USE at:
      [<c044dad5>] __lock_acquire+0x58c/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c042398e>] rq_attach_root+0x17/0xa7 [<c07ae52c>] sched_init+0x240/0x33e [<c079d661>] start_kernel+0x107/0x2fc [<c079d06a>] __init_begin+0x6a/0x6f [<ffffffff>] 0xffffffff
  }
  ... key at: [<c0800518>] __key.46938+0x0/0x8
 -> (&vec->lock){-.-...} ops: 34058 {
    IN-HARDIRQ-W at:
      [<c044d9e4>] __lock_acquire+0x49b/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c047ad3b>] cpupri_set+0x51/0xba [<c04219ee>] __enqueue_rt_entity+0xe2/0x1c8 [<c0421e18>] enqueue_rt_entity+0x19/0x23 [<c0428a52>] enqueue_task_rt+0x24/0x51 [<c041e03b>] enqueue_task+0x64/0x70 [<c041e06b>] activate_task+0x24/0x2a [<c0426f9e>] try_to_wake_up+0x1d2/0x2d4 [<c04270d7>] wake_up_process+0x14/0x16 [<c04408b6>] hrtimer_wakeup+0x1d/0x21 [<c0440922>] __run_hrtimer+0x68/0x98 [<c04411ca>] hrtimer_interrupt+0x101/0x153 [<c063406e>] smp_apic_timer_interrupt+0x6e/0x81 [<c04033c7>] apic_timer_interrupt+0x2f/0x34 [<c0401c4f>] cpu_idle+0x53/0x85 [<c061fc80>] rest_init+0x6c/0x6e [<c079d851>] start_kernel+0x2f7/0x2fc [<c079d06a>] __init_begin+0x6a/0x6f [<ffffffff>] 0xffffffff
    IN-SOFTIRQ-W at:
      [<c044da08>] __lock_acquire+0x4bf/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c047ad3b>] cpupri_set+0x51/0xba [<c04219ee>] __enqueue_rt_entity+0xe2/0x1c8 [<c0421e18>] enqueue_rt_entity+0x19/0x23 [<c0428a52>] enqueue_task_rt+0x24/0x51 [<c041e03b>] enqueue_task+0x64/0x70 [<c041e06b>] activate_task+0x24/0x2a [<c0426f9e>] try_to_wake_up+0x1d2/0x2d4 [<c04270d7>] wake_up_process+0x14/0x16 [<c042737c>] rebalance_domains+0x2a3/0x3ac [<c0429a06>] run_rebalance_domains+0x32/0xaa [<c0431379>] __do_softirq+0xb8/0x180 [<ffffffff>] 0xffffffff
    INITIAL USE at:
      [<c044dad5>] __lock_acquire+0x58c/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c047ad74>] cpupri_set+0x8a/0xba [<c04216f2>] rq_online_rt+0x5e/0x61 [<c041dd3a>] set_rq_online+0x40/0x4a [<c04239fb>] rq_attach_root+0x84/0xa7 [<c07ae52c>] sched_init+0x240/0x33e [<c079d661>] start_kernel+0x107/0x2fc [<c079d06a>] __init_begin+0x6a/0x6f [<ffffffff>] 0xffffffff
  }
  ... key at: [<c0c525d0>] __key.14261+0x0/0x10
  ... acquired at:
     [<c044d243>] validate_chain+0x8a8/0xbae [<c044dbfd>] __lock_acquire+0x6b4/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c047ad74>] cpupri_set+0x8a/0xba [<c04216f2>] rq_online_rt+0x5e/0x61 [<c041dd3a>] set_rq_online+0x40/0x4a [<c04239fb>] rq_attach_root+0x84/0xa7 [<c07ae52c>] sched_init+0x240/0x33e [<c079d661>] start_kernel+0x107/0x2fc [<c079d06a>] __init_begin+0x6a/0x6f [<ffffffff>] 0xffffffff
 -> (&rt_b->rt_runtime_lock){-.-...} ops: 336 {
    IN-HARDIRQ-W at:
      [<c044d9e4>] __lock_acquire+0x49b/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 [<c0421a75>] __enqueue_rt_entity+0x169/0x1c8 [<c0421e18>] enqueue_rt_entity+0x19/0x23 [<c0428a52>] enqueue_task_rt+0x24/0x51 [<c041e03b>] enqueue_task+0x64/0x70 [<c041e06b>] activate_task+0x24/0x2a [<c0426f9e>] try_to_wake_up+0x1d2/0x2d4 [<c04270d7>] wake_up_process+0x14/0x16 [<c04408b6>] hrtimer_wakeup+0x1d/0x21 [<c0440922>] __run_hrtimer+0x68/0x98 [<c04411ca>] hrtimer_interrupt+0x101/0x153 [<c063406e>] smp_apic_timer_interrupt+0x6e/0x81 [<c04033c7>] apic_timer_interrupt+0x2f/0x34 [<c0401c4f>] cpu_idle+0x53/0x85 [<c061fc80>] rest_init+0x6c/0x6e [<c079d851>] start_kernel+0x2f7/0x2fc [<c079d06a>] __init_begin+0x6a/0x6f [<ffffffff>] 0xffffffff
    IN-SOFTIRQ-W at:
      [<c044da08>] __lock_acquire+0x4bf/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 [<c0421a75>] __enqueue_rt_entity+0x169/0x1c8 [<c0421e18>] enqueue_rt_entity+0x19/0x23 [<c0428a52>] enqueue_task_rt+0x24/0x51 [<c041e03b>] enqueue_task+0x64/0x70 [<c041e06b>] activate_task+0x24/0x2a [<c0426f9e>] try_to_wake_up+0x1d2/0x2d4 [<c04270d7>] wake_up_process+0x14/0x16 [<c042737c>] rebalance_domains+0x2a3/0x3ac [<c0429a06>] run_rebalance_domains+0x32/0xaa [<c0431379>] __do_softirq+0xb8/0x180 [<ffffffff>] 0xffffffff
    INITIAL USE at:
      [<c044dad5>] __lock_acquire+0x58c/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 [<c0421a75>] __enqueue_rt_entity+0x169/0x1c8 [<c0421e18>] enqueue_rt_entity+0x19/0x23 [<c0428a52>] enqueue_task_rt+0x24/0x51 [<c041e03b>] enqueue_task+0x64/0x70 [<c041e06b>] activate_task+0x24/0x2a [<c0426f9e>] try_to_wake_up+0x1d2/0x2d4 [<c04270d7>] wake_up_process+0x14/0x16 [<c062b86b>] migration_call+0xb9/0x444 [<c07ae130>] migration_init+0x3b/0x4b [<c040115c>] do_one_initcall+0x6a/0x16e [<c079d44d>] kernel_init+0x4d/0x15a [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff
  }
  ... key at: [<c0800504>] __key.37924+0x0/0x8
 -> (&cpu_base->lock){-.-...} ops: 950512 {
    IN-HARDIRQ-W at:
      [<c044d9e4>] __lock_acquire+0x49b/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 [<c0440a3a>] hrtimer_run_queues+0xe8/0x131 [<c0435151>] run_local_timers+0xd/0x1e [<c0435486>] update_process_times+0x29/0x53 [<c0447fe0>] tick_periodic+0x6b/0x77 [<c0448009>] tick_handle_periodic+0x1d/0x60 [<c063406e>] smp_apic_timer_interrupt+0x6e/0x81 [<c04033c7>] apic_timer_interrupt+0x2f/0x34 [<c04082c7>] arch_dup_task_struct+0x19/0x81 [<c042ac1c>] copy_process+0xab/0x115f [<c042be78>] do_fork+0x129/0x2c5 [<c0401698>] kernel_thread+0x7f/0x87 [<c043e0b3>] kthreadd+0xa3/0xe3 [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff
    IN-SOFTIRQ-W at:
      [<c044da08>] __lock_acquire+0x4bf/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c0440b98>] lock_hrtimer_base+0x1d/0x38 [<c0440ca9>] __hrtimer_start_range_ns+0x1f/0x232 [<c0440ee7>] hrtimer_start_range_ns+0x15/0x17 [<c0448ef1>] tick_setup_sched_timer+0xf6/0x124 [<c0441558>] hrtimer_run_pending+0xb0/0xe8 [<c0434b76>] run_timer_softirq+0x27/0x1d1 [<c0431379>] __do_softirq+0xb8/0x180 [<ffffffff>] 0xffffffff
    INITIAL USE at:
      [<c044dad5>] __lock_acquire+0x58c/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c0440b98>] lock_hrtimer_base+0x1d/0x38 [<c0440ca9>] __hrtimer_start_range_ns+0x1f/0x232 [<c0421ab1>] __enqueue_rt_entity+0x1a5/0x1c8 [<c0421e18>] enqueue_rt_entity+0x19/0x23
[<c0428a52>] enqueue_task_rt+0x24/0x51 [<c041e03b>] enqueue_task+0x64/0x70 [<c041e06b>] activate_task+0x24/0x2a [<c0426f9e>] try_to_wake_up+0x1d2/0x2d4 [<c04270d7>] wake_up_process+0x14/0x16 [<c062b86b>] migration_call+0xb9/0x444 [<c07ae130>] migration_init+0x3b/0x4b [<c040115c>] do_one_initcall+0x6a/0x16e [<c079d44d>] kernel_init+0x4d/0x15a [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff } ... key at: [<c08234b8>] __key.20063+0x0/0x8 ... acquired at: [<c044d243>] validate_chain+0x8a8/0xbae [<c044dbfd>] __lock_acquire+0x6b4/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c0440b98>] lock_hrtimer_base+0x1d/0x38 [<c0440ca9>] __hrtimer_start_range_ns+0x1f/0x232 [<c0421ab1>] __enqueue_rt_entity+0x1a5/0x1c8 [<c0421e18>] enqueue_rt_entity+0x19/0x23 [<c0428a52>] enqueue_task_rt+0x24/0x51 [<c041e03b>] enqueue_task+0x64/0x70 [<c041e06b>] activate_task+0x24/0x2a [<c0426f9e>] try_to_wake_up+0x1d2/0x2d4 [<c04270d7>] wake_up_process+0x14/0x16 [<c062b86b>] migration_call+0xb9/0x444 [<c07ae130>] migration_init+0x3b/0x4b [<c040115c>] do_one_initcall+0x6a/0x16e [<c079d44d>] kernel_init+0x4d/0x15a [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff -> (&rt_rq->rt_runtime_lock){-.....} ops: 17587 { IN-HARDIRQ-W at: [<c044d9e4>] __lock_acquire+0x49b/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 [<c0421efc>] sched_rt_period_timer+0xda/0x24e [<c0440922>] __run_hrtimer+0x68/0x98 [<c04411ca>] hrtimer_interrupt+0x101/0x153 [<c063406e>] smp_apic_timer_interrupt+0x6e/0x81 [<c04033c7>] apic_timer_interrupt+0x2f/0x34 [<c0452203>] each_symbol_in_section+0x27/0x57 [<c045225a>] each_symbol+0x27/0x113 [<c0452373>] find_symbol+0x2d/0x51 [<c0454a7a>] load_module+0xaec/0x10eb [<c04550bf>] sys_init_module+0x46/0x19b [<c0402a68>] sysenter_do_call+0x12/0x36 [<ffffffff>] 0xffffffff INITIAL USE at: [<c044dad5>] __lock_acquire+0x58c/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 
[<c0421c41>] update_curr_rt+0x13a/0x20d [<c0421dd8>] dequeue_task_rt+0x13/0x3a [<c041df9e>] dequeue_task+0xff/0x10e [<c041dfd1>] deactivate_task+0x24/0x2a [<c062db54>] __schedule+0x162/0x991 [<c062e39a>] schedule+0x17/0x30 [<c0426c54>] migration_thread+0x175/0x203 [<c043e204>] kthread+0x4a/0x72 [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff } ... key at: [<c080050c>] __key.46863+0x0/0x8 ... acquired at: [<c044d243>] validate_chain+0x8a8/0xbae [<c044dbfd>] __lock_acquire+0x6b4/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 [<c041ee73>] __enable_runtime+0x43/0xb3 [<c04216d8>] rq_online_rt+0x44/0x61 [<c041dd3a>] set_rq_online+0x40/0x4a [<c062b8a5>] migration_call+0xf3/0x444 [<c063291c>] notifier_call_chain+0x2b/0x4a [<c0441e22>] __raw_notifier_call_chain+0x13/0x15 [<c0441e35>] raw_notifier_call_chain+0x11/0x13 [<c062bd2f>] _cpu_up+0xc3/0xf6 [<c062bdac>] cpu_up+0x4a/0x5a [<c079d49a>] kernel_init+0x9a/0x15a [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff ... acquired at: [<c044d243>] validate_chain+0x8a8/0xbae [<c044dbfd>] __lock_acquire+0x6b4/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 [<c0421a75>] __enqueue_rt_entity+0x169/0x1c8 [<c0421e18>] enqueue_rt_entity+0x19/0x23 [<c0428a52>] enqueue_task_rt+0x24/0x51 [<c041e03b>] enqueue_task+0x64/0x70 [<c041e06b>] activate_task+0x24/0x2a [<c0426f9e>] try_to_wake_up+0x1d2/0x2d4 [<c04270d7>] wake_up_process+0x14/0x16 [<c062b86b>] migration_call+0xb9/0x444 [<c07ae130>] migration_init+0x3b/0x4b [<c040115c>] do_one_initcall+0x6a/0x16e [<c079d44d>] kernel_init+0x4d/0x15a [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff ... 
acquired at: [<c044d243>] validate_chain+0x8a8/0xbae [<c044dbfd>] __lock_acquire+0x6b4/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 [<c0421c41>] update_curr_rt+0x13a/0x20d [<c0421dd8>] dequeue_task_rt+0x13/0x3a [<c041df9e>] dequeue_task+0xff/0x10e [<c041dfd1>] deactivate_task+0x24/0x2a [<c062db54>] __schedule+0x162/0x991 [<c062e39a>] schedule+0x17/0x30 [<c0426c54>] migration_thread+0x175/0x203 [<c043e204>] kthread+0x4a/0x72 [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff -> (&sig->cputimer.lock){......} ops: 1949 { INITIAL USE at: [<c044dad5>] __lock_acquire+0x58c/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c043f03e>] thread_group_cputimer+0x29/0x90 [<c044004c>] posix_cpu_timers_exit_group+0x16/0x39 [<c042e5f0>] release_task+0xa2/0x376 [<c042fbe1>] do_exit+0x548/0x5b3 [<c043a9d8>] __request_module+0x0/0x100 [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff } ... key at: [<c08014ac>] __key.15480+0x0/0x8 ... 
acquired at: [<c044d243>] validate_chain+0x8a8/0xbae [<c044dbfd>] __lock_acquire+0x6b4/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 [<c041f43a>] update_curr+0xef/0x107 [<c042131b>] enqueue_entity+0x1a/0x1c6 [<c0421535>] enqueue_task_fair+0x24/0x3e [<c041e03b>] enqueue_task+0x64/0x70 [<c041e06b>] activate_task+0x24/0x2a [<c0426f9e>] try_to_wake_up+0x1d2/0x2d4 [<c04270b0>] default_wake_function+0x10/0x12 [<c041d785>] __wake_up_common+0x34/0x5f [<c041ec26>] complete+0x30/0x43 [<c043e1e8>] kthread+0x2e/0x72 [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff -> (&rq->lock/1){..-...} ops: 3217 { IN-SOFTIRQ-W at: [<c044da08>] __lock_acquire+0x4bf/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630305>] _spin_lock_nested+0x2d/0x3e [<c0422cb4>] double_rq_lock+0x4b/0x7d [<c0427274>] rebalance_domains+0x19b/0x3ac [<c0429a06>] run_rebalance_domains+0x32/0xaa [<c0431379>] __do_softirq+0xb8/0x180 [<ffffffff>] 0xffffffff INITIAL USE at: [<c044dad5>] __lock_acquire+0x58c/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630305>] _spin_lock_nested+0x2d/0x3e [<c0422cb4>] double_rq_lock+0x4b/0x7d [<c0427274>] rebalance_domains+0x19b/0x3ac [<c0429a06>] run_rebalance_domains+0x32/0xaa [<c0431379>] __do_softirq+0xb8/0x180 [<ffffffff>] 0xffffffff } ... key at: [<c0800519>] __key.46938+0x1/0x8 ... 
acquired at: [<c044d243>] validate_chain+0x8a8/0xbae [<c044dbfd>] __lock_acquire+0x6b4/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 [<c0421c41>] update_curr_rt+0x13a/0x20d [<c0421dd8>] dequeue_task_rt+0x13/0x3a [<c041df9e>] dequeue_task+0xff/0x10e [<c041dfd1>] deactivate_task+0x24/0x2a [<c0427b1b>] push_rt_task+0x189/0x1f7 [<c0427b9b>] push_rt_tasks+0x12/0x19 [<c0427bb9>] post_schedule_rt+0x17/0x21 [<c0425a68>] finish_task_switch+0x83/0xc0 [<c062e339>] __schedule+0x947/0x991 [<c062e39a>] schedule+0x17/0x30 [<c0426c54>] migration_thread+0x175/0x203 [<c043e204>] kthread+0x4a/0x72 [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff ... acquired at: [<c044d243>] validate_chain+0x8a8/0xbae [<c044dbfd>] __lock_acquire+0x6b4/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c047ad3b>] cpupri_set+0x51/0xba [<c04219ee>] __enqueue_rt_entity+0xe2/0x1c8 [<c0421e18>] enqueue_rt_entity+0x19/0x23 [<c0428a52>] enqueue_task_rt+0x24/0x51 [<c041e03b>] enqueue_task+0x64/0x70 [<c041e06b>] activate_task+0x24/0x2a [<c0427b33>] push_rt_task+0x1a1/0x1f7 [<c0427b9b>] push_rt_tasks+0x12/0x19 [<c0427bb9>] post_schedule_rt+0x17/0x21 [<c0425a68>] finish_task_switch+0x83/0xc0 [<c062e339>] __schedule+0x947/0x991 [<c062e39a>] schedule+0x17/0x30 [<c0426c54>] migration_thread+0x175/0x203 [<c043e204>] kthread+0x4a/0x72 [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff ... acquired at: [<c044d243>] validate_chain+0x8a8/0xbae [<c044dbfd>] __lock_acquire+0x6b4/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630305>] _spin_lock_nested+0x2d/0x3e [<c0422cb4>] double_rq_lock+0x4b/0x7d [<c0427274>] rebalance_domains+0x19b/0x3ac [<c0429a06>] run_rebalance_domains+0x32/0xaa [<c0431379>] __do_softirq+0xb8/0x180 [<ffffffff>] 0xffffffff ... 
acquired at: [<c044d243>] validate_chain+0x8a8/0xbae [<c044dbfd>] __lock_acquire+0x6b4/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 [<c041ede7>] task_rq_lock+0x3b/0x62 [<c0426e41>] try_to_wake_up+0x75/0x2d4 [<c04270b0>] default_wake_function+0x10/0x12 [<c041d785>] __wake_up_common+0x34/0x5f [<c041ec26>] complete+0x30/0x43 [<c043e0cc>] kthreadd+0xbc/0xe3 [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff -> (&ep->lock){......} ops: 110 { INITIAL USE at: [<c044dad5>] __lock_acquire+0x58c/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c04ca381>] sys_epoll_ctl+0x232/0x3f6 [<c0402a68>] sysenter_do_call+0x12/0x36 [<ffffffff>] 0xffffffff } ... key at: [<c0c5be90>] __key.22301+0x0/0x10 ... acquired at: [<c044d243>] validate_chain+0x8a8/0xbae [<c044dbfd>] __lock_acquire+0x6b4/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 [<c041ede7>] task_rq_lock+0x3b/0x62 [<c0426e41>] try_to_wake_up+0x75/0x2d4 [<c04270b0>] default_wake_function+0x10/0x12 [<c041d785>] __wake_up_common+0x34/0x5f [<c041d7c6>] __wake_up_locked+0x16/0x1a [<c04ca7f5>] ep_poll_callback+0x7c/0xb6 [<c041d785>] __wake_up_common+0x34/0x5f [<c041ec70>] __wake_up_sync_key+0x37/0x4a [<c05cbefa>] sock_def_readable+0x42/0x71 [<c061c8b1>] unix_stream_connect+0x2f3/0x368 [<c05c830a>] sys_connect+0x59/0x76 [<c05c963f>] sys_socketcall+0x76/0x172 [<c0402a68>] sysenter_do_call+0x12/0x36 [<ffffffff>] 0xffffffff ... 
acquired at: [<c044d243>] validate_chain+0x8a8/0xbae [<c044dbfd>] __lock_acquire+0x6b4/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c04ca797>] ep_poll_callback+0x1e/0xb6 [<c041d785>] __wake_up_common+0x34/0x5f [<c041ec70>] __wake_up_sync_key+0x37/0x4a [<c05cbefa>] sock_def_readable+0x42/0x71 [<c061c8b1>] unix_stream_connect+0x2f3/0x368 [<c05c830a>] sys_connect+0x59/0x76 [<c05c963f>] sys_socketcall+0x76/0x172 [<c0402a68>] sysenter_do_call+0x12/0x36 [<ffffffff>] 0xffffffff ... acquired at: [<c044d243>] validate_chain+0x8a8/0xbae [<c044dbfd>] __lock_acquire+0x6b4/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c041ec0d>] complete+0x17/0x43 [<c0509cf2>] blk_end_sync_rq+0x2a/0x2d [<c0506935>] end_that_request_last+0x17b/0x1a1 [<c0506a0d>] blk_end_io+0x51/0x6f [<c0506a64>] blk_end_request+0x11/0x13 [<f8106c9c>] scsi_io_completion+0x1d9/0x41f [scsi_mod] [<f810152d>] scsi_finish_command+0xcc/0xd4 [scsi_mod] [<f8106fdb>] scsi_softirq_done+0xf9/0x101 [scsi_mod] [<c050a936>] blk_done_softirq+0x5e/0x70 [<c0431379>] __do_softirq+0xb8/0x180 [<ffffffff>] 0xffffffff -> (&n->list_lock){..-...} ops: 49241 { IN-SOFTIRQ-W at: [<c044da08>] __lock_acquire+0x4bf/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 [<c049bd18>] add_partial+0x16/0x40 [<c049d0d4>] __slab_free+0x96/0x28f [<c049df5c>] kmem_cache_free+0x8c/0xf2 [<c04a5ce9>] file_free_rcu+0x35/0x38 [<c0461a12>] rcu_process_callbacks+0x62/0x86 [<c0431379>] __do_softirq+0xb8/0x180 [<ffffffff>] 0xffffffff INITIAL USE at: [<c044dad5>] __lock_acquire+0x58c/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 [<c049bd18>] add_partial+0x16/0x40 [<c049d0d4>] __slab_free+0x96/0x28f [<c049df5c>] kmem_cache_free+0x8c/0xf2 [<c0514eda>] ida_get_new_above+0x13b/0x155 [<c0514f00>] ida_get_new+0xc/0xe [<c04a628b>] set_anon_super+0x39/0xa3 [<c04a68c6>] sget+0x2f3/0x386 [<c04a7365>] get_sb_single+0x24/0x8f [<c04e034c>] 
sysfs_get_sb+0x18/0x1a [<c04a6dd1>] vfs_kern_mount+0x40/0x7b [<c04a6e21>] kern_mount_data+0x15/0x17 [<c07b5ff6>] sysfs_init+0x50/0x9c [<c07b4ac9>] mnt_init+0x8c/0x1e4 [<c07b4737>] vfs_caches_init+0xd8/0xea [<c079d815>] start_kernel+0x2bb/0x2fc [<c079d06a>] __init_begin+0x6a/0x6f [<ffffffff>] 0xffffffff } ... key at: [<c0c5a424>] __key.25358+0x0/0x8 ... acquired at: [<c044d243>] validate_chain+0x8a8/0xbae [<c044dbfd>] __lock_acquire+0x6b4/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 [<c049cc45>] __slab_alloc+0xf6/0x4ef [<c049d333>] kmem_cache_alloc+0x66/0x11f [<f810189b>] scsi_pool_alloc_command+0x20/0x4c [scsi_mod] [<f81018de>] scsi_host_alloc_command+0x17/0x4f [scsi_mod] [<f810192b>] __scsi_get_command+0x15/0x71 [scsi_mod] [<f8101c41>] scsi_get_command+0x39/0x95 [scsi_mod] [<f81062b6>] scsi_get_cmd_from_req+0x26/0x50 [scsi_mod] [<f8106594>] scsi_setup_blk_pc_cmnd+0x2b/0xd7 [scsi_mod] [<f8106664>] scsi_prep_fn+0x24/0x33 [scsi_mod] [<c0504712>] elv_next_request+0xe6/0x18d [<f810704c>] scsi_request_fn+0x69/0x431 [scsi_mod] [<c05072af>] __generic_unplug_device+0x2e/0x31 [<c0509d59>] blk_execute_rq_nowait+0x64/0x86 [<c0509e2e>] blk_execute_rq+0xb3/0xd5 [<f81068f5>] scsi_execute+0xc5/0x11c [scsi_mod] [<f81069ff>] scsi_execute_req+0xb3/0x104 [scsi_mod] [<f812b40d>] sd_revalidate_disk+0x1a3/0xf64 [sd_mod] [<f812d52f>] sd_probe_async+0x146/0x22d [sd_mod] [<c044341f>] async_thread+0xe9/0x1c9 [<c043e204>] kthread+0x4a/0x72 [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff -> (&cwq->lock){-.-...} ops: 30335 { IN-HARDIRQ-W at: [<c044d9e4>] __lock_acquire+0x49b/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c043b54b>] __queue_work+0x14/0x30 [<c043b5ce>] queue_work_on+0x3a/0x46 [<c043b617>] queue_work+0x26/0x4a [<c043b64f>] schedule_work+0x14/0x16 [<c057a367>] schedule_console_callback+0x12/0x14 [<c05788ed>] kbd_event+0x595/0x600 [<c05b3d15>] input_pass_event+0x56/0x7e [<c05b4702>] 
input_handle_event+0x314/0x334 [<c05b4f1e>] input_event+0x50/0x63 [<c05b9bd4>] atkbd_interrupt+0x209/0x4e9 [<c05b1793>] serio_interrupt+0x38/0x6e [<c05b24e8>] i8042_interrupt+0x1db/0x1ec [<c045e922>] handle_IRQ_event+0xa4/0x169 [<c04602ea>] handle_edge_irq+0xc9/0x10a [<ffffffff>] 0xffffffff IN-SOFTIRQ-W at: [<c044da08>] __lock_acquire+0x4bf/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c043b54b>] __queue_work+0x14/0x30 [<c043b590>] delayed_work_timer_fn+0x29/0x2d [<c0434caa>] run_timer_softirq+0x15b/0x1d1 [<c0431379>] __do_softirq+0xb8/0x180 [<ffffffff>] 0xffffffff INITIAL USE at: [<c044dad5>] __lock_acquire+0x58c/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c043b54b>] __queue_work+0x14/0x30 [<c043b5ce>] queue_work_on+0x3a/0x46 [<c043b617>] queue_work+0x26/0x4a [<c043a7b3>] call_usermodehelper_exec+0x83/0xd0 [<c051631a>] kobject_uevent_env+0x351/0x385 [<c0516358>] kobject_uevent+0xa/0xc [<c0515a0e>] kset_register+0x2e/0x34 [<c0590f18>] bus_register+0xed/0x23d [<c07bea09>] platform_bus_init+0x23/0x38 [<c07beb77>] driver_init+0x1c/0x28 [<c079d4f6>] kernel_init+0xf6/0x15a [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff } ... 
key at: [<c08230a8>] __key.23814+0x0/0x8 -> (&workqueue_cpu_stat(cpu)->lock){-.-...} ops: 20397 { IN-HARDIRQ-W at: [<c044d9e4>] __lock_acquire+0x49b/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c0474909>] probe_workqueue_insertion+0x33/0x81 [<c043acf3>] insert_work+0x3f/0x9b [<c043b559>] __queue_work+0x22/0x30 [<c043b5ce>] queue_work_on+0x3a/0x46 [<c043b617>] queue_work+0x26/0x4a [<c043b64f>] schedule_work+0x14/0x16 [<c057a367>] schedule_console_callback+0x12/0x14 [<c05788ed>] kbd_event+0x595/0x600 [<c05b3d15>] input_pass_event+0x56/0x7e [<c05b4702>] input_handle_event+0x314/0x334 [<c05b4f1e>] input_event+0x50/0x63 [<c05b9bd4>] atkbd_interrupt+0x209/0x4e9 [<c05b1793>] serio_interrupt+0x38/0x6e [<c05b24e8>] i8042_interrupt+0x1db/0x1ec [<c045e922>] handle_IRQ_event+0xa4/0x169 [<c04602ea>] handle_edge_irq+0xc9/0x10a [<ffffffff>] 0xffffffff IN-SOFTIRQ-W at: [<c044da08>] __lock_acquire+0x4bf/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c0474909>] probe_workqueue_insertion+0x33/0x81 [<c043acf3>] insert_work+0x3f/0x9b [<c043b559>] __queue_work+0x22/0x30 [<c043b590>] delayed_work_timer_fn+0x29/0x2d [<c0434caa>] run_timer_softirq+0x15b/0x1d1 [<c0431379>] __do_softirq+0xb8/0x180 [<ffffffff>] 0xffffffff INITIAL USE at: [<c044dad5>] __lock_acquire+0x58c/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c04747eb>] probe_workqueue_creation+0xc9/0x10a [<c043abcb>] create_workqueue_thread+0x87/0xb0 [<c043b12f>] __create_workqueue_key+0x16d/0x1b2 [<c07aeedb>] init_workqueues+0x61/0x73 [<c079d4e7>] kernel_init+0xe7/0x15a [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff } ... key at: [<c0c52574>] __key.23424+0x0/0x8 ... 
acquired at: [<c044d243>] validate_chain+0x8a8/0xbae [<c044dbfd>] __lock_acquire+0x6b4/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c0474909>] probe_workqueue_insertion+0x33/0x81 [<c043acf3>] insert_work+0x3f/0x9b [<c043b559>] __queue_work+0x22/0x30 [<c043b5ce>] queue_work_on+0x3a/0x46 [<c043b617>] queue_work+0x26/0x4a [<c043a7b3>] call_usermodehelper_exec+0x83/0xd0 [<c051631a>] kobject_uevent_env+0x351/0x385 [<c0516358>] kobject_uevent+0xa/0xc [<c0515a0e>] kset_register+0x2e/0x34 [<c0590f18>] bus_register+0xed/0x23d [<c07bea09>] platform_bus_init+0x23/0x38 [<c07beb77>] driver_init+0x1c/0x28 [<c079d4f6>] kernel_init+0xf6/0x15a [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff ... acquired at: [<c044d243>] validate_chain+0x8a8/0xbae [<c044dbfd>] __lock_acquire+0x6b4/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c041ecaf>] __wake_up+0x1a/0x40 [<c043ad46>] insert_work+0x92/0x9b [<c043b559>] __queue_work+0x22/0x30 [<c043b5ce>] queue_work_on+0x3a/0x46 [<c043b617>] queue_work+0x26/0x4a [<c043a7b3>] call_usermodehelper_exec+0x83/0xd0 [<c051631a>] kobject_uevent_env+0x351/0x385 [<c0516358>] kobject_uevent+0xa/0xc [<c0515a0e>] kset_register+0x2e/0x34 [<c0590f18>] bus_register+0xed/0x23d [<c07bea09>] platform_bus_init+0x23/0x38 [<c07beb77>] driver_init+0x1c/0x28 [<c079d4f6>] kernel_init+0xf6/0x15a [<c04034e7>] kernel_thread_helper+0x7/0x10 [<ffffffff>] 0xffffffff ... 
acquired at: [<c044d243>] validate_chain+0x8a8/0xbae [<c044dbfd>] __lock_acquire+0x6b4/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c043b54b>] __queue_work+0x14/0x30 [<c043b5ce>] queue_work_on+0x3a/0x46 [<c043b617>] queue_work+0x26/0x4a [<c0505679>] kblockd_schedule_work+0x12/0x14 [<c05113bb>] elv_schedule_dispatch+0x41/0x48 [<c0513377>] elv_ioq_completed_request+0x2dc/0x2fe [<c05045aa>] elv_completed_request+0x48/0x97 [<c0506738>] __blk_put_request+0x36/0xb8 [<c0506953>] end_that_request_last+0x199/0x1a1 [<c0506a0d>] blk_end_io+0x51/0x6f [<c0506a64>] blk_end_request+0x11/0x13 [<f8106c9c>] scsi_io_completion+0x1d9/0x41f [scsi_mod] [<f810152d>] scsi_finish_command+0xcc/0xd4 [scsi_mod] [<f8106fdb>] scsi_softirq_done+0xf9/0x101 [scsi_mod] [<c050a936>] blk_done_softirq+0x5e/0x70 [<c0431379>] __do_softirq+0xb8/0x180 [<ffffffff>] 0xffffffff -> (&zone->lock){..-...} ops: 80266 { IN-SOFTIRQ-W at: [<c044da08>] __lock_acquire+0x4bf/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 [<c047fc71>] __free_pages_ok+0x167/0x321 [<c04800ce>] __free_pages+0x29/0x2b [<c049c7c1>] __free_slab+0xb2/0xba [<c049c800>] discard_slab+0x37/0x39 [<c049d15c>] __slab_free+0x11e/0x28f [<c049df5c>] kmem_cache_free+0x8c/0xf2 [<c042ab6e>] free_task+0x31/0x34 [<c042c37b>] __put_task_struct+0xd3/0xd8 [<c042e072>] delayed_put_task_struct+0x60/0x64 [<c0461a12>] rcu_process_callbacks+0x62/0x86 [<c0431379>] __do_softirq+0xb8/0x180 [<ffffffff>] 0xffffffff INITIAL USE at: [<c044dad5>] __lock_acquire+0x58c/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 [<c047f7b6>] free_pages_bulk+0x21/0x1a1 [<c047ffcf>] free_hot_cold_page+0x181/0x20f [<c04800a3>] free_hot_page+0xf/0x11 [<c04800c5>] __free_pages+0x20/0x2b [<c07c4d96>] __free_pages_bootmem+0x6d/0x71 [<c07b2244>] free_all_bootmem_core+0xd2/0x177 [<c07b22f6>] free_all_bootmem+0xd/0xf [<c07ad21a>] mem_init+0x28/0x28c [<c079d7b1>] start_kernel+0x257/0x2fc [<c079d06a>] 
__init_begin+0x6a/0x6f [<ffffffff>] 0xffffffff } ... key at: [<c0c52628>] __key.30749+0x0/0x8 ... acquired at: [<c044d243>] validate_chain+0x8a8/0xbae [<c044dbfd>] __lock_acquire+0x6b4/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c048035e>] get_page_from_freelist+0x236/0x3e3 [<c04805f4>] __alloc_pages_internal+0xce/0x371 [<c049cce6>] __slab_alloc+0x197/0x4ef [<c049d333>] kmem_cache_alloc+0x66/0x11f [<c047d96b>] mempool_alloc_slab+0x13/0x15 [<c047da5c>] mempool_alloc+0x3a/0xd5 [<f81063cc>] scsi_sg_alloc+0x47/0x4a [scsi_mod] [<c051cd02>] __sg_alloc_table+0x48/0xc7 [<f8106325>] scsi_init_sgtable+0x2c/0x8c [scsi_mod] [<f81064e7>] scsi_init_io+0x19/0x9b [scsi_mod] [<f8106abf>] scsi_setup_fs_cmnd+0x6f/0x73 [scsi_mod] [<f812ca73>] sd_prep_fn+0x6a/0x7d4 [sd_mod] [<c0504712>] elv_next_request+0xe6/0x18d [<f810704c>] scsi_request_fn+0x69/0x431 [scsi_mod] [<c05072af>] __generic_unplug_device+0x2e/0x31 [<c05072db>] blk_start_queueing+0x29/0x2b [<c05137b8>] elv_ioq_request_add+0x2be/0x393 [<c05048cd>] elv_insert+0x114/0x1a2 [<c05049ec>] __elv_add_request+0x91/0x96 [<c0507a00>] __make_request+0x365/0x397 [<c050635a>] generic_make_request+0x342/0x3ce [<c0507b21>] submit_bio+0xef/0xfa [<c04c6c4e>] mpage_bio_submit+0x21/0x26 [<c04c7b7f>] mpage_readpages+0xa3/0xad [<f80c1ea8>] ext3_readpages+0x19/0x1b [ext3] [<c048275e>] __do_page_cache_readahead+0xfd/0x166 [<c0482b42>] do_page_cache_readahead+0x44/0x52 [<c047d665>] filemap_fault+0x197/0x3ae [<c048b9ea>] __do_fault+0x40/0x37b [<c048d43f>] handle_mm_fault+0x2bb/0x646 [<c063273c>] do_page_fault+0x29c/0x2fd [<c0630b4a>] error_code+0x72/0x78 [<ffffffff>] 0xffffffff -> (&page_address_htable[i].lock){......} ops: 6802 { INITIAL USE at: [<c044dad5>] __lock_acquire+0x58c/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c048af69>] page_address+0x50/0xa6 [<c048b0e7>] kmap_high+0x21/0x175 [<c041b7ef>] kmap+0x4e/0x5b [<c04abb36>] page_getlink+0x37/0x59 [<c04abb75>] 
page_follow_link_light+0x1d/0x2b [<c04ad4d0>] __link_path_walk+0x3d1/0xa71 [<c04adbae>] path_walk+0x3e/0x77 [<c04add0e>] do_path_lookup+0xeb/0x105 [<c04ae6f2>] path_lookup_open+0x48/0x7a [<c04a8e96>] open_exec+0x25/0xf4 [<c04a9c2d>] do_execve+0xfa/0x2cc [<c04015c0>] sys_execve+0x2b/0x54 [<c0402ae9>] syscall_call+0x7/0xb [<ffffffff>] 0xffffffff } ... key at: [<c0c5288c>] __key.28547+0x0/0x14 ... acquired at: [<c044d243>] validate_chain+0x8a8/0xbae [<c044dbfd>] __lock_acquire+0x6b4/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c063056b>] _spin_lock_irqsave+0x33/0x43 [<c048af69>] page_address+0x50/0xa6 [<c05078a1>] __make_request+0x206/0x397 [<c050635a>] generic_make_request+0x342/0x3ce [<c0507b21>] submit_bio+0xef/0xfa [<c04c6c4e>] mpage_bio_submit+0x21/0x26 [<c04c78b8>] do_mpage_readpage+0x471/0x5e5 [<c04c7b55>] mpage_readpages+0x79/0xad [<f80c1ea8>] ext3_readpages+0x19/0x1b [ext3] [<c048275e>] __do_page_cache_readahead+0xfd/0x166 [<c0482b42>] do_page_cache_readahead+0x44/0x52 [<c047d665>] filemap_fault+0x197/0x3ae [<c048b9ea>] __do_fault+0x40/0x37b [<c048d43f>] handle_mm_fault+0x2bb/0x646 [<c063273c>] do_page_fault+0x29c/0x2fd [<c0630b4a>] error_code+0x72/0x78 [<ffffffff>] 0xffffffff ... 
acquired at: [<c044d243>] validate_chain+0x8a8/0xbae [<c044dbfd>] __lock_acquire+0x6b4/0x73e [<c044dd36>] lock_acquire+0xaf/0xcc [<c0630340>] _spin_lock+0x2a/0x39 [<c046143d>] call_rcu+0x36/0x5b [<c050f0c8>] cfq_cic_free+0x15/0x17 [<c050f128>] cic_free_func+0x5e/0x64 [<c050ea90>] __call_for_each_cic+0x23/0x2e [<c050eaad>] cfq_free_io_context+0x12/0x14 [<c050978c>] put_io_context+0x4b/0x66 [<c050f00a>] cfq_active_ioq_reset+0x21/0x39 [<c0511044>] elv_reset_active_ioq+0x2b/0x3e [<c0512ecf>] __elv_ioq_slice_expired+0x238/0x26a [<c0512f1f>] elv_ioq_slice_expired+0x1e/0x20 [<c0513860>] elv_ioq_request_add+0x366/0x393 [<c05048cd>] elv_insert+0x114/0x1a2 [<c05049ec>] __elv_add_request+0x91/0x96 [<c0507a00>] __make_request+0x365/0x397 [<c050635a>] generic_make_request+0x342/0x3ce [<c0507b21>] submit_bio+0xef/0xfa [<c04bf495>] submit_bh+0xe3/0x102 [<c04c04b0>] ll_rw_block+0xbe/0xf7 [<f80c35ba>] ext3_bread+0x39/0x79 [ext3] [<f80c5643>] dx_probe+0x2f/0x298 [ext3] [<f80c5956>] ext3_find_entry+0xaa/0x573 [ext3] [<f80c739e>] ext3_lookup+0x31/0xbe [ext3] [<c04abf7c>] do_lookup+0xbc/0x159 [<c04ad7e8>] __link_path_walk+0x6e9/0xa71 [<c04adbae>] path_walk+0x3e/0x77 [<c04add0e>] do_path_lookup+0xeb/0x105 [<c04ae584>] user_path_at+0x41/0x6c [<c04a8301>] vfs_fstatat+0x32/0x59 [<c04a8417>] vfs_stat+0x18/0x1a [<c04a8432>] sys_stat64+0x19/0x2d [<c0402a68>] sysenter_do_call+0x12/0x36 [<ffffffff>] 0xffffffff -> (&iocg->lock){+.+...} ops: 3 { HARDIRQ-ON-W at: [<c044b840>] mark_held_locks+0x3d/0x58 [<c044b963>] trace_hardirqs_on_caller+0x108/0x14c [<c044b9b2>] trace_hardirqs_on+0xb/0xd [<c0630883>] _spin_unlock_irq+0x27/0x47 [<c0513baa>] iocg_destroy+0xbc/0x118 [<c045a16a>] cgroup_diput+0x4b/0xa7 [<c04b1dbb>] dentry_iput+0x78/0x9c [<c04b1e82>] d_kill+0x21/0x3b [<c04b2f2a>] dput+0xf3/0xfc [<c04ae226>] do_rmdir+0x9a/0xc8 [<c04ae29d>] sys_rmdir+0x15/0x17 [<c0402a68>] sysenter_do_call+0x12/0x36 [<ffffffff>] 0xffffffff SOFTIRQ-ON-W at: [<c044b840>] mark_held_locks+0x3d/0x58 [<c044b97c>] 
trace_hardirqs_on_caller+0x121/0x14c
    [<c044b9b2>] trace_hardirqs_on+0xb/0xd
    [<c0630883>] _spin_unlock_irq+0x27/0x47
    [<c0513baa>] iocg_destroy+0xbc/0x118
    [<c045a16a>] cgroup_diput+0x4b/0xa7
    [<c04b1dbb>] dentry_iput+0x78/0x9c
    [<c04b1e82>] d_kill+0x21/0x3b
    [<c04b2f2a>] dput+0xf3/0xfc
    [<c04ae226>] do_rmdir+0x9a/0xc8
    [<c04ae29d>] sys_rmdir+0x15/0x17
    [<c0402a68>] sysenter_do_call+0x12/0x36
    [<ffffffff>] 0xffffffff
  INITIAL USE at:
    [<c044dad5>] __lock_acquire+0x58c/0x73e
    [<c044dd36>] lock_acquire+0xaf/0xcc
    [<c06304ea>] _spin_lock_irq+0x30/0x3f
    [<c05119bd>] io_alloc_root_group+0x104/0x155
    [<c05133cb>] elv_init_fq_data+0x32/0xe0
    [<c0504317>] elevator_alloc+0x150/0x170
    [<c0505393>] elevator_init+0x9d/0x100
    [<c0507088>] blk_init_queue_node+0xc4/0xf7
    [<c05070cb>] blk_init_queue+0x10/0x12
    [<f81060fd>] __scsi_alloc_queue+0x1c/0xba [scsi_mod]
    [<f81061b0>] scsi_alloc_queue+0x15/0x4e [scsi_mod]
    [<f810803d>] scsi_alloc_sdev+0x154/0x1f5 [scsi_mod]
    [<f8108387>] scsi_probe_and_add_lun+0x123/0xb5b [scsi_mod]
    [<f8109847>] __scsi_add_device+0x8a/0xb0 [scsi_mod]
    [<f816ad14>] ata_scsi_scan_host+0x77/0x141 [libata]
    [<f816903f>] async_port_probe+0xa0/0xa9 [libata]
    [<c044341f>] async_thread+0xe9/0x1c9
    [<c043e204>] kthread+0x4a/0x72
    [<c04034e7>] kernel_thread_helper+0x7/0x10
    [<ffffffff>] 0xffffffff
  }
  ... key at: [<c0c5ebd8>] __key.29462+0x0/0x8
  ... acquired at:
    [<c044d243>] validate_chain+0x8a8/0xbae
    [<c044dbfd>] __lock_acquire+0x6b4/0x73e
    [<c044dd36>] lock_acquire+0xaf/0xcc
    [<c063056b>] _spin_lock_irqsave+0x33/0x43
    [<c0510f6f>] io_group_chain_link+0x5c/0x106
    [<c0511ba7>] io_find_alloc_group+0x54/0x60
    [<c0511c11>] io_get_io_group_bio+0x5e/0x89
    [<c0511cc3>] io_group_get_request_list+0x12/0x21
    [<c0507485>] get_request_wait+0x124/0x15d
    [<c050797e>] __make_request+0x2e3/0x397
    [<c050635a>] generic_make_request+0x342/0x3ce
    [<c0507b21>] submit_bio+0xef/0xfa
    [<c04c6c4e>] mpage_bio_submit+0x21/0x26
    [<c04c7b7f>] mpage_readpages+0xa3/0xad
    [<f80c1ea8>] ext3_readpages+0x19/0x1b [ext3]
    [<c048275e>] __do_page_cache_readahead+0xfd/0x166
    [<c048294a>] ondemand_readahead+0x10a/0x118
    [<c04829db>] page_cache_sync_readahead+0x1b/0x20
    [<c047cf37>] generic_file_aio_read+0x226/0x545
    [<c04a4cf6>] do_sync_read+0xb0/0xee
    [<c04a54b0>] vfs_read+0x8f/0x136
    [<c04a8d7c>] kernel_read+0x39/0x4b
    [<c04a8e69>] prepare_binprm+0xdb/0xe3
    [<c04a9ca8>] do_execve+0x175/0x2cc
    [<c04015c0>] sys_execve+0x2b/0x54
    [<c0402a68>] sysenter_do_call+0x12/0x36
    [<ffffffff>] 0xffffffff

stack backtrace:
Pid: 2186, comm: rmdir Not tainted 2.6.30-rc4-io #6
Call Trace:
 [<c044b1ac>] print_irq_inversion_bug+0x13b/0x147
 [<c044c3e5>] check_usage_backwards+0x7d/0x86
 [<c044b5ec>] mark_lock+0x2d3/0x4ea
 [<c044c368>] ? check_usage_backwards+0x0/0x86
 [<c044b840>] mark_held_locks+0x3d/0x58
 [<c0630883>] ? _spin_unlock_irq+0x27/0x47
 [<c044b97c>] trace_hardirqs_on_caller+0x121/0x14c
 [<c044b9b2>] trace_hardirqs_on+0xb/0xd
 [<c0630883>] _spin_unlock_irq+0x27/0x47
 [<c0513baa>] iocg_destroy+0xbc/0x118
 [<c045a16a>] cgroup_diput+0x4b/0xa7
 [<c04b1dbb>] dentry_iput+0x78/0x9c
 [<c04b1e82>] d_kill+0x21/0x3b
 [<c04b2f2a>] dput+0xf3/0xfc
 [<c04ae226>] do_rmdir+0x9a/0xc8
 [<c04029b1>] ? resume_userspace+0x11/0x28
 [<c051aa14>] ? trace_hardirqs_on_thunk+0xc/0x10
 [<c0402b34>] ? restore_nocheck_notrace+0x0/0xe
 [<c06324a0>] ? do_page_fault+0x0/0x2fd
 [<c044b97c>] ? trace_hardirqs_on_caller+0x121/0x14c
 [<c04ae29d>] sys_rmdir+0x15/0x17
 [<c0402a68>] sysenter_do_call+0x12/0x36

[-- Attachment #4: Type: text/plain, Size: 206 bytes --]

_______________________________________________
Containers mailing list
Containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA@public.gmane.org
https://lists.linux-foundation.org/mailman/listinfo/containers
* Re: IO scheduler based IO Controller V2
  [not found] ` <20090506161012.GC8180-H+wXaHxf7aLQT0dZR+AlfA@public.gmane.org>
  2009-05-07  5:36 ` Li Zefan
@ 2009-05-07  5:47 ` Gui Jianfeng
  1 sibling, 0 replies; 97+ messages in thread
From: Gui Jianfeng @ 2009-05-07  5:47 UTC (permalink / raw)
To: Vivek Goyal
Cc: dhaval-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, snitzer-H+wXaHxf7aLQT0dZR+AlfA, dm-devel-H+wXaHxf7aLQT0dZR+AlfA, jens.axboe-QHcLZuEGTsvQT0dZR+AlfA, agk-H+wXaHxf7aLQT0dZR+AlfA, balbir-23VcF4HTsmIX0ybBhKVfKdBPR1lH4CV8, paolo.valente-rcYM44yAMweonA0d6jMUrA, fernando-gVGce1chcLdL9jVzuh4AOg, jmoyer-H+wXaHxf7aLQT0dZR+AlfA, fchecconi-Re5JQEeQqe8AvxtiuMwx3w, containers-cunTk1MwBs9QetFLy7KEm3xJsTq8ys+cHZ5vskTnxNA, linux-kernel-u79uwXL29TY76Z2rM5mHXA, akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b, righi.andrea-Re5JQEeQqe8AvxtiuMwx3w

[-- Attachment #1: Type: text/plain, Size: 2218 bytes --]

Vivek Goyal wrote:
> Hi Gui,
>
> Thanks for the report. I use cgroup_path() for debugging. I guess that
> cgroup_path() was passed a null cgrp pointer and that's why it crashed.
>
> If yes, then it is strange though. I call cgroup_path() only after
> grabbing a reference to the css object. (I am assuming that if I have a
> valid reference to the css object then css->cgrp can't be null.)

I think so too...

> Anyway, can you please try out the following patch and see if it fixes
> your crash.
>
> ---
>  block/elevator-fq.c |   10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
>
> Index: linux11/block/elevator-fq.c
> ===================================================================
> --- linux11.orig/block/elevator-fq.c	2009-05-05 15:38:06.000000000 -0400
> +++ linux11/block/elevator-fq.c	2009-05-06 11:55:47.000000000 -0400
> @@ -125,6 +125,9 @@ static void io_group_path(struct io_grou
>  	unsigned short id = iog->iocg_id;
>  	struct cgroup_subsys_state *css;
>
> +	/* For error case */
> +	buf[0] = '\0';
> +
>  	rcu_read_lock();
>
>  	if (!id)
> @@ -137,15 +140,12 @@ static void io_group_path(struct io_grou
>  	if (!css_tryget(css))
>  		goto out;
>
> -	cgroup_path(css->cgroup, buf, buflen);
> +	if (css->cgroup)

According to CR2, when the kernel crashed css->cgroup was equal to
0x00000100, so I guess this patch won't fix the issue.

> +		cgroup_path(css->cgroup, buf, buflen);
>
>  	css_put(css);
> -
> -	rcu_read_unlock();
> -	return;
>  out:
>  	rcu_read_unlock();
> -	buf[0] = '\0';
>  	return;
>  }
>  #endif
>
> BTW, I tried following equivalent script and I can't see the crash on
> my system. Are you able to hit it regularly?

Yes, there is a 50% chance that I can reproduce it. I've attached the
rwio source code.

> Instead of killing the tasks I also tried moving the tasks into the root
> cgroup and then deleting the test1 and test2 groups; that also did not
> produce any crash. (Hit a different bug though after 5-6 attempts :-)
>
> As I mentioned in the patchset, currently we do have issues with group
> refcounting and cgroup/group going away. Hopefully in the next version
> they all should be fixed up. But still, it is nice to hear back...
>
> --

Regards
Gui Jianfeng

[-- Attachment #2: rwio.c --]
[-- Type: image/x-xbitmap, Size: 1613 bytes --]
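The crash discussion above revolves around two defensive habits: only dereference an object after a tryget succeeds, and initialize the output buffer up front so every error path returns a well-defined (empty) string. A toy Python sketch of that discipline (all names here are hypothetical illustrations, not the kernel's css/cgroup API):

```python
class RefCounted:
    """Minimal stand-in for a refcounted kernel object."""
    def __init__(self, path):
        self.refcnt = 1
        self.path = path

    def tryget(self):
        # Fails once the object has been torn down (refcnt dropped to 0),
        # mirroring css_tryget() failing during cgroup destruction.
        if self.refcnt > 0:
            self.refcnt += 1
            return True
        return False

    def put(self):
        self.refcnt -= 1

def group_path(css):
    buf = ""                 # set the error-case result first, like buf[0] = '\0'
    if css is not None and css.tryget():
        buf = css.path       # safe to dereference: we hold a reference
        css.put()
    return buf               # every exit path returns a valid string

live = RefCounted("/test1")
print(group_path(live))      # "/test1"
dead = RefCounted("/test2")
dead.put()                   # torn down: refcnt is now 0
print(group_path(dead))      # "" (tryget failed, defined empty result)
```

Note that, as Gui points out, this pattern does not help if the reference itself is stale (css->cgroup pointing at freed memory such as 0x00000100); that is a lifetime bug the refcounting rework has to fix.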
* IO scheduler based IO Controller V2
@ 2009-05-05 19:58 Vivek Goyal
  0 siblings, 0 replies; 97+ messages in thread
From: Vivek Goyal @ 2009-05-05 19:58 UTC (permalink / raw)
To: nauman-hpIqsD4AKlfQT0dZR+AlfA, dpshah-hpIqsD4AKlfQT0dZR+AlfA, lizf-BthXqXjhjHXQFUHtdCDX3A, mikew-hpIqsD4AKlfQT0dZR+AlfA, fchecconi-Re5JQEeQqe8AvxtiuMwx3w, paolo.valente-rcYM44yAMweonA0d6jMUrA, jens.axboe-QHcLZuEGTsvQT0dZR+AlfA, ryov-jCdQPDEk3idL9jVzuh4AOg, fer
Cc: akpm-de/tnXTf+JLsfHDXvbKv3WD2FQJk+8+b

Hi All,

Here is the V2 of the IO controller patches generated on top of 2.6.30-rc4.
The first version of the patches was posted here.

http://lkml.org/lkml/2009/3/11/486

This patchset is still a work in progress, but I want to keep getting
snapshots of my tree out at regular intervals for feedback, hence V2.
Before I go into the details of the major changes from V1, I wanted to
highlight the other IO controller proposals on lkml.

Other active IO controller proposals
------------------------------------
Currently, primarily two other IO controller proposals are out there.

dm-ioband
---------
This patch set is from Ryo Tsuruta of valinux. It is a proportional
bandwidth controller implemented as a dm driver.

http://people.valinux.co.jp/~ryov/dm-ioband/

The biggest issue (apart from others) with a 2nd-level IO controller is
that buffering of BIOs takes place in a single queue, and dispatch of
these BIOs to the underlying IO scheduler is in FIFO manner. That means
whenever buffering takes place, it breaks CFQ's notion of different
classes and priorities: RT requests might be stuck behind some write
requests, or some read requests might be stuck behind some write requests
for a long time, etc.

To demonstrate the single FIFO dispatch issues, I ran some basic tests
and posted the results in the following mail thread.
http://lkml.org/lkml/2009/4/13/2

These are hard issues to solve, and to fully resolve them one will end up
maintaining separate queues for separate classes and priorities, as CFQ
does. But that will make the 2nd-level implementation complex. At the
same time, if somebody is trying to use the IO controller on a single
disk, or on hardware RAID using CFQ as the scheduler, there will be two
layers of queueing maintaining separate queues per priority level: one
at the dm-driver level and another in CFQ, which again does not make a
lot of sense. On the other hand, if a user is running noop at the device
level, at the higher level we will be maintaining multiple CFQ-like
queues, which also does not make sense, as the underlying IO scheduler
never wanted that.

Hence, IMHO, controlling bios at the second level is probably not a very
good idea. We should instead do it at the IO scheduler level, where we
already maintain all the needed queues. We just need to make the
scheduling hierarchical and group-aware, so the IO of one group is
isolated from the others.

IO-throttling
-------------
This patch set from Andrea Righi provides a max bandwidth controller.
That means it does not guarantee minimum bandwidth; it provides maximum
bandwidth limits and throttles the application if it crosses its
bandwidth. So it is not an apples-to-apples comparison: this patch set
and dm-ioband provide proportional bandwidth control, where a cgroup can
use much more bandwidth if there are no other users, and resource
control comes into the picture only if there is contention.

It seems that both kinds of users are out there: one set of people
needing proportional BW control, and other people needing max bandwidth
control.

Now the question is, where should max bandwidth control be implemented?
At higher layers or at the IO scheduler level? Should proportional bw
control and max bw control be implemented separately at different layers,
or should they be implemented in one place?
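The single-FIFO dispatch problem described for dm-ioband above can be illustrated with a toy model (purely illustrative, not dm-ioband's actual code): once requests of different CFQ classes are buffered in one FIFO, an RT request that arrives behind earlier best-effort writes waits for all of them, while per-class queues would dispatch it first.

```python
from collections import deque

def fifo_dispatch(requests):
    """Single FIFO queue: dispatch strictly in arrival order."""
    q = deque(requests)
    return [q.popleft() for _ in range(len(q))]

def per_class_dispatch(requests):
    """Per-class queues (CFQ-like): the RT class is always served first."""
    rt = deque(r for r in requests if r[0] == "RT")
    be = deque(r for r in requests if r[0] == "BE")
    order = []
    while rt or be:
        q = rt if rt else be
        order.append(q.popleft())
    return order

# Three best-effort writes are buffered first, then one RT read arrives.
reqs = [("BE", "w1"), ("BE", "w2"), ("BE", "w3"), ("RT", "r1")]
print(fifo_dispatch(reqs).index(("RT", "r1")))       # 3: RT served last
print(per_class_dispatch(reqs).index(("RT", "r1")))  # 0: RT served first
```

This is the inversion the mail refers to: the FIFO layer erases the priority information that the underlying scheduler would otherwise honor.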
IMHO, if we are doing proportional bw control at the IO scheduler layer,
it should be possible to extend it to do max bw control there as well
without a lot of effort. Then it probably does not make much sense to do
the two types of control at two different layers. Doing it in one place
should lead to less code and reduced complexity. Secondly, the
io-throttling solution also buffers writes at a higher layer, which
again leads to the issue of losing the notion of priority of writes.
Hence, personally I think that users will need both proportional bw as
well as max bw control, and we should probably implement these in a
single place instead of splitting them. Once the elevator based io
controller patchset matures, it can be enhanced to do max bw control
also.

Having said that, one issue with doing upper limit control at the
elevator/IO scheduler level is that it does not have the view of higher
level logical devices. So if there is a software RAID with two disks,
one cannot do max bw control on the logical device; instead it will have
to be done on the leaf nodes where the io schedulers are attached.

Now back to the description of this patchset and the changes from V1.

- Rebased patches to 2.6.30-rc4.

- Last time Andrew mentioned that async writes are a big issue for us,
  hence I introduced control for async writes also.

- Implemented per-group request descriptor support. This was needed to
  make sure that one group doing a lot of IO does not starve other groups
  of request descriptors, leaving them unable to get their fair share.
  This is a basic patch right now which will probably require more
  changes after some discussion.

- Exported the disk time used and the number of sectors dispatched by a
  cgroup through the cgroup interface. This should help us see how much
  disk time each group got and whether it is fair or not.

- Implemented group refcounting support. Lack of this was causing some
  cgroup related issues. There are still some races left which need to
  be fixed.
- For IO tracking/async write tracking, started making use of the
  blkio-cgroup patches from Ryo Tsuruta posted here.

  http://lkml.org/lkml/2009/4/28/235

  Currently people seem to like the idea of a separate subsystem for
  tracking writes, so the rest of the users can use that info instead of
  everybody implementing their own. How many such users will actually
  end up in the kernel is a different question, though. So instead of
  carrying my own version of the bio-cgroup patches and overloading the
  io controller cgroup subsystem, I am making use of the blkio-cgroup
  patches. One will have to mount the io controller and blkio subsystems
  together on the same hierarchy for the time being. Later we can take
  care of the case where blkio is mounted on a different hierarchy.

- Replaced group priorities with group weights.

Testing
=======
Again, I have been able to do only very basic testing of reads and
writes. I did not want to hold the patches back because of testing.
Providing support for async writes took much more time than expected,
and work is still left in that area. I will continue to do more testing.

Test1 (Fairness for synchronous reads)
======================================
- Two dd's in two cgroups with cgroup weights 1000 and 500. Ran the two
  "dd"s in those cgroups (with the CFQ scheduler and
  /sys/block/<device>/queue/fairness = 1).

dd if=/mnt/$BLOCKDEV/zerofile1 of=/dev/null &
dd if=/mnt/$BLOCKDEV/zerofile2 of=/dev/null &

234179072 bytes (234 MB) copied, 4.13954 s, 56.6 MB/s
234179072 bytes (234 MB) copied, 5.2127 s, 44.9 MB/s

group1 time=3108 group1 sectors=460968
group2 time=1405 group2 sectors=264944

This patchset tries to provide fairness in terms of disk time received.
group1 got almost double the disk time of group2 (at the time the first
dd finished). These time and sector statistics can be read using the
io.disk_time and io.disk_sector files in the cgroup. More about it in
the documentation file.
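The Test1 numbers above (disk time 3108 vs 1405 at weights 1000 vs 500) are roughly what a weight-proportional time allocator should produce. A minimal sketch of the accounting, assuming both groups stay continuously backlogged so disk time splits purely by weight (illustrative only, not the elevator code):

```python
def share_disk_time(weights, total_time):
    """Split total disk time between groups in proportion to their weights."""
    total_weight = sum(weights.values())
    return {g: total_time * w / total_weight for g, w in weights.items()}

# Weights from Test1; total_time chosen near the observed 3108 + 1405.
t = share_disk_time({"group1": 1000, "group2": 500}, total_time=4500)
print(t)  # group1 gets twice group2's share (3000.0 vs 1500.0)
```

The measured split (3108 : 1405, about 2.2 : 1) is close to the ideal 2 : 1; the small deviation is expected since only whole time slices can be handed out and the first dd finished earlier.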
Test2 (Fairness for async writes)
=================================
Fairness for async writes is tricky, and the biggest reason is that async
writes are cached in higher layers (page cache) and are dispatched to
lower layers not necessarily in a proportional manner. For example,
consider two dd threads reading /dev/zero as the input file and writing
huge files. Very soon we will cross vm_dirty_ratio, and a dd thread will
be forced to write out some pages to disk before more pages can be
dirtied. But not necessarily the dirty pages of the same thread are
picked: it can very well pick the inode of the lesser priority dd thread
and do some writeout. So effectively the higher weight dd is doing
writeouts of the lower weight dd's pages, and we don't see service
differentiation.

IOW, the core problem with async write fairness is that the higher weight
thread does not throw enough IO traffic at the IO controller to keep the
queue continuously backlogged. There are many 0.2 to 0.8 second intervals
where the higher weight queue is empty, and in that duration the lower
weight queue gets lots of work done, giving the impression that there was
no service differentiation.

In summary, from the IO controller point of view, async write support is
there. Now we need to do some more work in higher layers to make sure a
higher weight process is not blocked behind the IO of some lower weight
process. This is a TODO item.

So to test async writes I generated lots of write traffic in two cgroups
(50 fio threads each) and watched the disk time statistics in the
respective cgroups at 2 second intervals. Thanks to Ryo Tsuruta for the
test case.
*****************************************************************
sync
echo 3 > /proc/sys/vm/drop_caches

fio_args="--size=64m --rw=write --numjobs=50 --group_reporting"

echo $$ > /cgroup/bfqio/test1/tasks
fio $fio_args --name=test1 --directory=/mnt/sdd1/fio/ --output=/mnt/sdd1/fio/test1.log &

echo $$ > /cgroup/bfqio/test2/tasks
fio $fio_args --name=test2 --directory=/mnt/sdd2/fio/ --output=/mnt/sdd2/fio/test2.log &
***********************************************************************

And I watched the disk time and sector statistics for both cgroups every
2 seconds using a script. Here is a snippet of the output.

test1 statistics: time=9848  sectors=643152
test2 statistics: time=5224  sectors=258600

test1 statistics: time=11736 sectors=785792
test2 statistics: time=6509  sectors=333160

test1 statistics: time=13607 sectors=943968
test2 statistics: time=7443  sectors=394352

test1 statistics: time=15662 sectors=1089496
test2 statistics: time=8568  sectors=451152

So the disk time consumed by group1 is almost double that of group2.

Your feedback and comments are welcome.

Thanks
Vivek
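As a sanity check, the statistics snippet above can be parsed to confirm the roughly 2:1 service differentiation between test1 and test2. A throwaway helper, assuming only the `time=.../sectors=...` line format printed by the watcher script:

```python
import re

# First and last sample pairs from the output snippet above.
snippet = """\
test1 statistics: time=9848 sectors=643152
test2 statistics: time=5224 sectors=258600
test1 statistics: time=15662 sectors=1089496
test2 statistics: time=8568 sectors=451152
"""

def time_ratios(text):
    """Pair up consecutive test1/test2 samples and return disk-time ratios."""
    stats = {}
    for grp, t in re.findall(r"(test\d) statistics: time=(\d+)", text):
        stats.setdefault(grp, []).append(int(t))
    return [t1 / t2 for t1, t2 in zip(stats["test1"], stats["test2"])]

print([round(r, 2) for r in time_ratios(snippet)])  # each ratio is near 1.8-1.9
```

The ratios stay a little under the ideal 2.0, consistent with the higher weight queue occasionally going empty as described in Test2.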