* Question about switch_mm function
From: Sreejith M M @ 2015-01-28 16:26 UTC
To: kernelnewbies

Hi,

I was trying to understand the difference in scheduling between processes and threads (belonging to the same process).

I was thinking that when the kernel has to switch to a task which belongs to the same process, it does not have to clear or replace page global directories and other memory-related information.

But in the switch_mm function, some code is placed under the CONFIG_SMP option. What is its significance? The code is below (http://lxr.free-electrons.com/source/arch/x86/include/asm/mmu_context.h#L37).

What I infer is that the code flushes the TLB, reloads the page table directories, etc. in multiprocessor mode (obviously), but I believe this code may never be executed.

Can anyone help me understand what this part of the function is supposed to do?

    #ifdef CONFIG_SMP
        else {
            this_cpu_write(cpu_tlbstate.state, TLBSTATE_OK);
            BUG_ON(this_cpu_read(cpu_tlbstate.active_mm) != next);

            if (!cpumask_test_cpu(cpu, mm_cpumask(next))) {
                /*
                 * On established mms, the mm_cpumask is only changed
                 * from irq context, from ptep_clear_flush() while in
                 * lazy tlb mode, and here. Irqs are blocked during
                 * schedule, protecting us from simultaneous changes.
                 */
                cpumask_set_cpu(cpu, mm_cpumask(next));
                /*
                 * We were in lazy tlb mode and leave_mm disabled
                 * tlb flush IPI delivery. We must reload CR3
                 * to make sure to use no freed page tables.
                 */
                load_cr3(next->pgd);
                trace_tlb_flush(TLB_FLUSH_ON_TASK_SWITCH, TLB_FLUSH_ALL);
                load_LDT_nolock(&next->context);
            }
        }
    #endif

--
Regards,
Sreejith
* Question about switch_mm function
From: Sreejith M M @ 2015-03-25 13:30 UTC
To: kernelnewbies

On Wed, Jan 28, 2015 at 9:56 PM, Sreejith M M <sreejith.mm@gmail.com> wrote:
> [...]

Hi,

can someone please give me any answers for this?

--
Regards,
Sreejith
* Question about switch_mm function
From: Rajat Sharma @ 2015-03-25 16:00 UTC
To: kernelnewbies

On Mar 25, 2015 6:33 AM, "Sreejith M M" <sreejith.mm@gmail.com> wrote:
> [...]
> Hi,
>
> can someone please give me any answers for this?

This code handles the context switch from a kernel thread back to a user-mode thread. At that point the TLB entries are invalid translations for the user-mode thread and do not correspond to the user process's pgd. They are master kernel page table translations left over from the kernel thread's execution.

-Rajat
* Question about switch_mm function
From: Sreejith M M @ 2015-03-25 16:05 UTC
To: kernelnewbies

On Wed, Mar 25, 2015 at 9:30 PM, Rajat Sharma <fs.rajat@gmail.com> wrote:
> [...]

Hi Rajat,

If that is the case, why is this code put under the CONFIG_SMP switch?

--
Regards,
Sreejith
* Question about switch_mm function
From: Valdis.Kletnieks at vt.edu @ 2015-03-25 17:25 UTC
To: kernelnewbies

On Wed, 25 Mar 2015 21:35:22 +0530, Sreejith M M said:
> [...]
> If that is the case, why is this code put under the CONFIG_SMP switch?

Vastly simplified because I'm lazy :)

If you look at the code, it's poking the status on *other* CPUs. That's why the cpumask() stuff.

If you're on a single execution unit, you don't have to tell the other CPU about the change in state, because there isn't another CPU.
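As a rough illustration of the cpumask point, the following user-space C sketch (not kernel code) models an mm's CPU mask as a plain bitmask. The names demo_mm, demo_cpumask_set_cpu, demo_cpumask_clear_cpu, demo_cpumask_test_cpu and demo_count_flush_targets, as well as the fixed DEMO_NR_CPUS, are assumptions made for this example. Only the idea that the mask records which CPUs must be told about a change comes from the thread.

```c
#include <assert.h>

#define DEMO_NR_CPUS 4 /* assumption: small fixed CPU count for the sketch */

/* Hypothetical stand-in for an mm: bit n set means
 * "CPU n may hold TLB entries for this address space". */
struct demo_mm {
    unsigned long cpumask;
};

static void demo_cpumask_set_cpu(int cpu, struct demo_mm *mm)
{
    assert(cpu >= 0 && cpu < DEMO_NR_CPUS);
    mm->cpumask |= 1UL << cpu;
}

static void demo_cpumask_clear_cpu(int cpu, struct demo_mm *mm)
{
    assert(cpu >= 0 && cpu < DEMO_NR_CPUS);
    mm->cpumask &= ~(1UL << cpu);
}

static int demo_cpumask_test_cpu(int cpu, const struct demo_mm *mm)
{
    return (int)((mm->cpumask >> cpu) & 1UL);
}

/* Count the *other* CPUs that would have to be notified when CPU 'self'
 * changes a mapping in 'mm'. With a single CPU this is always 0. */
static int demo_count_flush_targets(int self, const struct demo_mm *mm)
{
    int cpu, n = 0;

    for (cpu = 0; cpu < DEMO_NR_CPUS; cpu++)
        if (cpu != self && demo_cpumask_test_cpu(cpu, mm))
            n++;
    return n;
}
```

With only one CPU in the mask, demo_count_flush_targets() returns 0 for that CPU, which is the single-execution-unit situation: there is nobody else to notify.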
* Question about switch_mm function
From: Sreejith M M @ 2015-03-25 17:31 UTC
To: kernelnewbies

On Wed, Mar 25, 2015 at 10:55 PM, <Valdis.Kletnieks@vt.edu> wrote:
> [...]
> Vastly simplified because I'm lazy :)
>
> If you look at the code, it's poking the status on *other* CPUs. That's why
> the cpumask() stuff.
>
> If you're on a single execution unit, you don't have to tell the other
> CPU about the change in state, because there isn't an other CPU.

Can you come out of this lazy mode and explain this a bit more, because I am a newbie? Or tell me what else I should know before I can understand this code.

--
Regards,
Sreejith
* Question about switch_mm function
From: Rajat Sharma @ 2015-03-25 17:33 UTC
To: kernelnewbies

On Mar 25, 2015 10:31 AM, "Sreejith M M" <sreejith.mm@gmail.com> wrote:
> [...]
> can you come out of this lazy mode explain this a bit more because I
> am a newbie ?or tell me what else I should know before I have to
> understand this code

Valdis is talking about lazy tlb flush, not him being lazy. Otherwise he wouldn't have replied at all :)
* Question about switch_mm function
From: Rajat Sharma @ 2015-03-25 19:13 UTC
To: kernelnewbies

On Wed, Mar 25, 2015 at 10:33 AM, Rajat Sharma <fs.rajat@gmail.com> wrote:
> [...]
> Valdis is talking about lazy tlb flush, not him being lazy. Otherwise he wouldn't have replied at all :)

Okay, a bit more detail. I admit I had to dig a bit more to find this out. After all, we are all newbies :)

On an SMP system, there is an optimization called lazy TLB mode for kernel threads. Follow the steps:

1. Assume that some of the CPUs are executing a multithreaded user-mode application, so essentially they all share the same mm and page tables.
2. Now let's say one CPU changes or assigns a physical page frame for a user-mode linear address, say as a result of processing a system call on behalf of the user-mode process (putting data in a user-mode buffer, etc.). It needs to invalidate the old TLB entry for this linear address in its local TLB.
3. Since the application is multithreaded, other CPUs sharing the same page tables will still have the old translation for that linear address in their TLBs.
4. Normally we would invalidate the TLB entries on all CPUs sharing these page tables.
5. Now suppose some of the participating CPUs are running a kernel thread, which does not want to be bothered about this change because it has nothing to do with user-mode TLB entries. It puts its executing CPU into a do-not-disturb mode called lazy TLB mode.
6. TLB invalidation on all CPUs executing kernel threads is deferred until the kernel thread is finished.
7. At this point, when the kernel thread switches back to the user-mode process, the invalidation is done, and that is the code you are referring to.

In case you wonder where the invalidation happens: invalidation is an arch-specific step. In its simplest form, it flushes all TLB entries and lets them build up again over time. That's why it is costly, and why an optimization like lazy TLB mode pays off. On x86 it is done by loading cr3.

http://stackoverflow.com/questions/1090218/what-does-this-little-bit-of-x86-doing-with-cr3

-Rajat
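The steps above can be sketched as a small user-space simulation in C. This is a toy model under stated assumptions, not the kernel's implementation: toy_mm, toy_cpu, the generation counters and all toy_* functions are hypothetical names invented here, and a "flush" is modelled as copying a counter. It only mirrors the shape of the logic: a lazy CPU drops out of the mask instead of flushing, and the switch back notices the missing bit and reloads.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NR_CPUS 4 /* assumption: small fixed CPU count */

enum toy_tlbstate { TOY_TLBSTATE_OK, TOY_TLBSTATE_LAZY };

struct toy_mm {
    unsigned long cpumask;   /* CPUs that still accept flush requests */
    unsigned int generation; /* bumped on every page table change */
};

struct toy_cpu {
    enum toy_tlbstate state;
    struct toy_mm *active_mm;    /* mm whose page tables are loaded */
    unsigned int tlb_generation; /* generation this CPU's toy TLB reflects */
};

static struct toy_cpu toy_cpus[TOY_NR_CPUS];

static bool toy_in_mask(int cpu, const struct toy_mm *mm)
{
    return (mm->cpumask >> cpu) & 1UL;
}

/* Step 1: a user task of 'mm' starts running on 'cpu' (ordinary switch). */
static void toy_run_user(int cpu, struct toy_mm *mm)
{
    toy_cpus[cpu].state = TOY_TLBSTATE_OK;
    toy_cpus[cpu].active_mm = mm;
    toy_cpus[cpu].tlb_generation = mm->generation;
    mm->cpumask |= 1UL << cpu;
}

/* Step 5: a kernel thread keeps the loaded mm and marks the CPU lazy. */
static void toy_enter_lazy(int cpu)
{
    toy_cpus[cpu].state = TOY_TLBSTATE_LAZY;
}

/* Steps 2-6: 'self' changes a mapping and notifies the CPUs in the mask. */
static void toy_change_mapping(int self, struct toy_mm *mm)
{
    mm->generation++;
    for (int cpu = 0; cpu < TOY_NR_CPUS; cpu++) {
        if (!toy_in_mask(cpu, mm))
            continue;
        if (cpu != self && toy_cpus[cpu].state == TOY_TLBSTATE_LAZY)
            mm->cpumask &= ~(1UL << cpu); /* lazy: opt out, no flush */
        else
            toy_cpus[cpu].tlb_generation = mm->generation; /* flush */
    }
}

/* Step 7: back from the kernel thread to a user task of the same mm.
 * Returns true if a full reload (the "load CR3" case) was needed. */
static bool toy_switch_back(int cpu, struct toy_mm *next)
{
    toy_cpus[cpu].state = TOY_TLBSTATE_OK;
    assert(toy_cpus[cpu].active_mm == next);
    if (toy_in_mask(cpu, next))
        return false; /* never opted out, toy TLB was kept current */
    next->cpumask |= 1UL << cpu;
    toy_cpus[cpu].tlb_generation = next->generation;
    return true;
}
```

In this model, toy_switch_back() returns true only when a mapping changed while the CPU was lazy. If nothing changed during the kernel thread's run, the CPU is still in the mask and the costly reload is skipped, which is the saving the optimization is after.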
* Question about switch_mm function
From: Valdis.Kletnieks at vt.edu @ 2015-03-25 19:25 UTC
To: kernelnewbies

On Wed, 25 Mar 2015 12:13:55 -0700, Rajat Sharma said:
> Okay bit more details, I admit I had to dig through bit more to find
> this out. After all, we all are newbies :)

And you probably learned 3 times more while digging than if I had spelled it out for you :)
* Question about switch_mm function
From: Rajat Sharma @ 2015-03-25 19:39 UTC
To: kernelnewbies

On Mar 25, 2015 12:26 PM, <Valdis.Kletnieks@vt.edu> wrote:
> [...]
> And you probably learned 3 times more while digging than if I had spelled it
> out for you :)

Completely agree :)