* [linux-3.18 bisection] complete test-amd64-amd64-xl-qemut-win7-amd64
@ 2016-12-14  8:34 osstest service owner
  2016-12-14 13:29   ` Ian Jackson
  0 siblings, 1 reply; 13+ messages in thread
From: osstest service owner @ 2016-12-14  8:34 UTC (permalink / raw)
  To: xen-devel, osstest-admin

branch xen-unstable
xenbranch xen-unstable
job test-amd64-amd64-xl-qemut-win7-amd64
testid xen-boot

Tree: linux git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git
Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git
Tree: qemuu git://xenbits.xen.org/qemu-xen.git
Tree: xen git://xenbits.xen.org/xen.git

*** Found and reproduced problem changeset ***

  Bug is in tree:  linux git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
  Bug introduced:  a2d8c514753276394d68414f563591f174ef86cb
  Bug not present: 8f620446135b64ca6f96cf32066a76d64e79a388
  Last fail repro: http://logs.test-lab.xenproject.org/osstest/logs/103315/


  commit a2d8c514753276394d68414f563591f174ef86cb
  Author: Lukasz Odzioba <lukasz.odzioba@intel.com>
  Date:   Fri Jun 24 14:50:01 2016 -0700
  
      mm/swap.c: flush lru pvecs on compound page arrival
      
      [ Upstream commit 8f182270dfec432e93fae14f9208a6b9af01009f ]
      
      Currently we can have compound pages held on per cpu pagevecs, which
      leads to a lot of memory unavailable for reclaim when needed.  In
      systems with hundreds of processors it can be GBs of memory.
      
      One way of reproducing the problem is to not call munmap
      explicitly on all mapped regions (i.e.  after receiving SIGTERM).  After
      that some pages (with THP enabled also huge pages) may end up on
      lru_add_pvec, example below.
      
        // build with OpenMP enabled, e.g. gcc -fopenmp
        #include <string.h>
        #include <sys/mman.h>
      
        int main(void) {
        #pragma omp parallel
        {
      	size_t size = 55 * 1000 * 1000; // smaller than  MEM/CPUS
      	void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
      		MAP_PRIVATE | MAP_ANONYMOUS , -1, 0);
      	if (p != MAP_FAILED)
      		memset(p, 0, size);
      	//munmap(p, size); // uncomment to make the problem go away
        }
        return 0;
        }
      
      When we run it with THP enabled it will leave a significant amount of
      memory on lru_add_pvec.  This memory will not be reclaimed if we hit
      OOM, so when we run the above program in a loop:
      
      	for i in `seq 100`; do ./a.out; done
      
      many processes (95% in my case) will be killed by OOM.
      
      The primary point of the LRU add cache is to save the zone lru_lock
      contention with a hope that more pages will belong to the same zone and
      so their addition can be batched.  The huge page is already a form of
      batched addition (it will add 512 worth of memory in one go) so skipping
      the batching seems like a safer option when compared to a potential
      excess in the caching which can be quite large and much harder to fix
      because lru_add_drain_all is way too expensive and it is not really clear
      what would be a good moment to call it.
      
      Similarly we can reproduce the problem on lru_deactivate_pvec by adding:
      madvise(p, size, MADV_FREE); after memset.
      
      This patch flushes lru pvecs on compound page arrival making the problem
      less severe - after applying it kill rate of above example drops to 0%,
      due to reducing maximum amount of memory held on pvec from 28MB (with
      THP) to 56kB per CPU.
      
      Suggested-by: Michal Hocko <mhocko@suse.com>
      Link: http://lkml.kernel.org/r/1466180198-18854-1-git-send-email-lukasz.odzioba@intel.com
      Signed-off-by: Lukasz Odzioba <lukasz.odzioba@intel.com>
      Acked-by: Michal Hocko <mhocko@suse.com>
      Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
      Cc: Andrea Arcangeli <aarcange@redhat.com>
      Cc: Vladimir Davydov <vdavydov@parallels.com>
      Cc: Ming Li <mingli199x@qq.com>
      Cc: Minchan Kim <minchan@kernel.org>
      Cc: <stable@vger.kernel.org>
      Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
      Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
      Signed-off-by: Sasha Levin <sasha.levin@oracle.com>


For bisection revision-tuple graph see:
   http://logs.test-lab.xenproject.org/osstest/results/bisect/linux-3.18/test-amd64-amd64-xl-qemut-win7-amd64.xen-boot.html
Revision IDs in each graph node refer, respectively, to the Trees above.

----------------------------------------
Running cs-bisection-step --graph-out=/home/logs/results/bisect/linux-3.18/test-amd64-amd64-xl-qemut-win7-amd64.xen-boot --summary-out=tmp/103315.bisection-summary --basis-template=101675 --blessings=real,real-bisect linux-3.18 test-amd64-amd64-xl-qemut-win7-amd64 xen-boot
Searching for failure / basis pass:
 103169 fail [host=elbling1] / 101675 [host=elbling0] 101662 [host=elbling0] 101648 [host=elbling0] 101637 [host=elbling0] 101623 [host=elbling0] 101603 [host=elbling0] 101584 [host=elbling0] 101570 [host=elbling0] 101561 [host=elbling0] 101552 [host=elbling0] 101541 [host=elbling0] 101532 [host=elbling0] 101515 [host=elbling0] 101497 [host=elbling0] 101493 [host=elbling0] 101487 [host=elbling0] 101483 [host=elbling0] 101480 [host=elbling0] 101476 [host=elbling0] 101470 [host=elbling0] 101460 [host=elbling0] 101434 [host=elbling0] 101424 [host=elbling0] 101413 [host=elbling0] 101398 [host=elbling0] 101389 [host=elbling0] 101000 [host=elbling0] 100758 [host=elbling0] 100752 [host=elbling0] 100597 [host=elbling0] 100588 [host=elbling0] 100385 [host=elbling0] 100372 [host=elbling0] 99832 [host=elbling0] 96188 [host=elbling0] 96161 [host=elbling0] 95844 [host=elbling0] 95809 [host=elbling0] 95597 [host=elbling0] 95521 [host=elbling0] 95458 [host=elbling0] 95406 [host=elbling0] 94728 ok.
Failure / basis pass flights: 103169 / 94728
(tree with no url: minios)
(tree with no url: ovmf)
(tree with no url: seabios)
Tree: linux git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
Tree: linuxfirmware git://xenbits.xen.org/osstest/linux-firmware.git
Tree: qemu git://xenbits.xen.org/qemu-xen-traditional.git
Tree: qemuu git://xenbits.xen.org/qemu-xen.git
Tree: xen git://xenbits.xen.org/xen.git
Latest ac3d826bef907afe35f80ecccbcdd57223df4b88 c530a75c1e6a472b0eb9558310b518f0dfcd8860 89c4cbe8d234049b0145e4dc5e5d19d626250b57 4220231eb22235e757d269722b9f6a594fbcb70f 8e4b2676685f50bc26f03b5f62d8b7aea8e69dbf
Basis pass 3b6aa07b936b09d38c1bfcee1e06845b968df475 c530a75c1e6a472b0eb9558310b518f0dfcd8860 df553c056104e3dd8a2bd2e72539a57c4c085bae 62b3d206425c245ed0a020390a64640d40d97471 bab2bd8e222de9e596699ac080ea985af828c4c4
Generating revisions with ./adhoc-revtuple-generator  git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git#3b6aa07b936b09d38c1bfcee1e06845b968df475-ac3d826bef907afe35f80ecccbcdd57223df4b88 git://xenbits.xen.org/osstest/linux-firmware.git#c530a75c1e6a472b0eb9558310b518f0dfcd8860-c530a75c1e6a472b0eb9558310b518f0dfcd8860 git://xenbits.xen.org/qemu-xen-traditional.git#df553c056104e3dd8a2bd2e72539a57c4c085bae-89c4cbe8d234049b0145e4dc5e5d19d626250b57 git://xenbits.xen.org/qemu-xen.git#62b3d206425c245ed0a020390a64640d40d97471-4220231eb22235e757d269722b9f6a594fbcb70f git://xenbits.xen.org/xen.git#bab2bd8e222de9e596699ac080ea985af828c4c4-8e4b2676685f50bc26f03b5f62d8b7aea8e69dbf
adhoc-revtuple-generator: tree discontiguous: qemu-xen
adhoc-revtuple-generator: tree discontiguous: xen
Loaded 2006 nodes in revision graph
Searching for test results:
 94728 pass 3b6aa07b936b09d38c1bfcee1e06845b968df475 c530a75c1e6a472b0eb9558310b518f0dfcd8860 df553c056104e3dd8a2bd2e72539a57c4c085bae 62b3d206425c245ed0a020390a64640d40d97471 bab2bd8e222de9e596699ac080ea985af828c4c4
 95406 [host=elbling0]
 95458 [host=elbling0]
 95521 [host=elbling0]
 95597 [host=elbling0]
 95809 [host=elbling0]
 95844 [host=elbling0]
 96161 [host=elbling0]
 96188 [host=elbling0]
 97278 [host=elbling0]
 97289 [host=elbling0]
 97319 [host=elbling0]
 97377 [host=elbling0]
 97426 [host=elbling0]
 97533 [host=elbling0]
 97682 [host=elbling0]
 97592 [host=elbling0]
 97637 [host=elbling0]
 97724 [host=elbling0]
 99656 [host=elbling0]
 99603 []
 99698 [host=elbling0]
 99718 [host=elbling0]
 99766 [host=elbling0]
 99832 [host=elbling0]
 100385 [host=elbling0]
 100372 [host=elbling0]
 100588 [host=elbling0]
 100597 [host=elbling0]
 100758 [host=elbling0]
 100752 [host=elbling0]
 101000 [host=elbling0]
 101389 [host=elbling0]
 101424 [host=elbling0]
 101413 [host=elbling0]
 101434 [host=elbling0]
 101398 [host=elbling0]
 101460 [host=elbling0]
 101466 []
 101470 [host=elbling0]
 101476 [host=elbling0]
 101480 [host=elbling0]
 101487 [host=elbling0]
 101483 [host=elbling0]
 101493 [host=elbling0]
 101532 [host=elbling0]
 101515 [host=elbling0]
 101497 [host=elbling0]
 101552 [host=elbling0]
 101541 [host=elbling0]
 101561 [host=elbling0]
 101570 [host=elbling0]
 101584 [host=elbling0]
 101603 [host=elbling0]
 101648 [host=elbling0]
 101637 [host=elbling0]
 101623 [host=elbling0]
 101662 [host=elbling0]
 101675 [host=elbling0]
 102732 fail irrelevant
 102754 fail irrelevant
 102773 fail irrelevant
 102823 fail irrelevant
 102875 fail irrelevant
 102974 fail ac3d826bef907afe35f80ecccbcdd57223df4b88 c530a75c1e6a472b0eb9558310b518f0dfcd8860 89c4cbe8d234049b0145e4dc5e5d19d626250b57 4220231eb22235e757d269722b9f6a594fbcb70f 8e4b2676685f50bc26f03b5f62d8b7aea8e69dbf
 102920 fail ac3d826bef907afe35f80ecccbcdd57223df4b88 c530a75c1e6a472b0eb9558310b518f0dfcd8860 89c4cbe8d234049b0145e4dc5e5d19d626250b57 4220231eb22235e757d269722b9f6a594fbcb70f 8e4b2676685f50bc26f03b5f62d8b7aea8e69dbf
 103074 fail ac3d826bef907afe35f80ecccbcdd57223df4b88 c530a75c1e6a472b0eb9558310b518f0dfcd8860 89c4cbe8d234049b0145e4dc5e5d19d626250b57 4220231eb22235e757d269722b9f6a594fbcb70f 8e4b2676685f50bc26f03b5f62d8b7aea8e69dbf
 103169 fail ac3d826bef907afe35f80ecccbcdd57223df4b88 c530a75c1e6a472b0eb9558310b518f0dfcd8860 89c4cbe8d234049b0145e4dc5e5d19d626250b57 4220231eb22235e757d269722b9f6a594fbcb70f 8e4b2676685f50bc26f03b5f62d8b7aea8e69dbf
 103242 fail dda467549d44f4d39b4420511ddcf77d29e9b6b2 c530a75c1e6a472b0eb9558310b518f0dfcd8860 6e20809727261599e8527c456eb078c0e89139a1 62b3d206425c245ed0a020390a64640d40d97471 bab2bd8e222de9e596699ac080ea985af828c4c4
 103259 pass 59b520454b323ec43b2ae757217332cea33091e0 c530a75c1e6a472b0eb9558310b518f0dfcd8860 6e20809727261599e8527c456eb078c0e89139a1 62b3d206425c245ed0a020390a64640d40d97471 bab2bd8e222de9e596699ac080ea985af828c4c4
 103252 fail 4aaf33222ca629d65ebd91a5b1755f66349d3626 c530a75c1e6a472b0eb9558310b518f0dfcd8860 6e20809727261599e8527c456eb078c0e89139a1 62b3d206425c245ed0a020390a64640d40d97471 bab2bd8e222de9e596699ac080ea985af828c4c4
 103245 pass cc7a8303741c5fd9a0c733d69d4fa3a154cbd707 c530a75c1e6a472b0eb9558310b518f0dfcd8860 6e20809727261599e8527c456eb078c0e89139a1 62b3d206425c245ed0a020390a64640d40d97471 bab2bd8e222de9e596699ac080ea985af828c4c4
 103234 pass 3b6aa07b936b09d38c1bfcee1e06845b968df475 c530a75c1e6a472b0eb9558310b518f0dfcd8860 df553c056104e3dd8a2bd2e72539a57c4c085bae 62b3d206425c245ed0a020390a64640d40d97471 bab2bd8e222de9e596699ac080ea985af828c4c4
 103249 fail 1497c0db632dd3106687f13c43d4055ad7fc2531 c530a75c1e6a472b0eb9558310b518f0dfcd8860 6e20809727261599e8527c456eb078c0e89139a1 62b3d206425c245ed0a020390a64640d40d97471 bab2bd8e222de9e596699ac080ea985af828c4c4
 103239 fail ac3d826bef907afe35f80ecccbcdd57223df4b88 c530a75c1e6a472b0eb9558310b518f0dfcd8860 89c4cbe8d234049b0145e4dc5e5d19d626250b57 4220231eb22235e757d269722b9f6a594fbcb70f 8e4b2676685f50bc26f03b5f62d8b7aea8e69dbf
 103275 fail a2d8c514753276394d68414f563591f174ef86cb c530a75c1e6a472b0eb9558310b518f0dfcd8860 6e20809727261599e8527c456eb078c0e89139a1 62b3d206425c245ed0a020390a64640d40d97471 bab2bd8e222de9e596699ac080ea985af828c4c4
 103281 pass 4c2b0216cdf54e81f7c0e841b5bb1116701ae25b c530a75c1e6a472b0eb9558310b518f0dfcd8860 6e20809727261599e8527c456eb078c0e89139a1 62b3d206425c245ed0a020390a64640d40d97471 bab2bd8e222de9e596699ac080ea985af828c4c4
 103269 fail e9393f71fe6bb76aebb272dd8d8ce366c88d754c c530a75c1e6a472b0eb9558310b518f0dfcd8860 6e20809727261599e8527c456eb078c0e89139a1 62b3d206425c245ed0a020390a64640d40d97471 bab2bd8e222de9e596699ac080ea985af828c4c4
 103285 pass faa35ed7c7dd74a62bb58340e0ba1819ec33e4e1 c530a75c1e6a472b0eb9558310b518f0dfcd8860 6e20809727261599e8527c456eb078c0e89139a1 62b3d206425c245ed0a020390a64640d40d97471 bab2bd8e222de9e596699ac080ea985af828c4c4
 103290 pass 8f620446135b64ca6f96cf32066a76d64e79a388 c530a75c1e6a472b0eb9558310b518f0dfcd8860 6e20809727261599e8527c456eb078c0e89139a1 62b3d206425c245ed0a020390a64640d40d97471 bab2bd8e222de9e596699ac080ea985af828c4c4
 103295 fail a2d8c514753276394d68414f563591f174ef86cb c530a75c1e6a472b0eb9558310b518f0dfcd8860 6e20809727261599e8527c456eb078c0e89139a1 62b3d206425c245ed0a020390a64640d40d97471 bab2bd8e222de9e596699ac080ea985af828c4c4
 103299 pass 8f620446135b64ca6f96cf32066a76d64e79a388 c530a75c1e6a472b0eb9558310b518f0dfcd8860 6e20809727261599e8527c456eb078c0e89139a1 62b3d206425c245ed0a020390a64640d40d97471 bab2bd8e222de9e596699ac080ea985af828c4c4
 103303 fail a2d8c514753276394d68414f563591f174ef86cb c530a75c1e6a472b0eb9558310b518f0dfcd8860 6e20809727261599e8527c456eb078c0e89139a1 62b3d206425c245ed0a020390a64640d40d97471 bab2bd8e222de9e596699ac080ea985af828c4c4
 103307 pass 8f620446135b64ca6f96cf32066a76d64e79a388 c530a75c1e6a472b0eb9558310b518f0dfcd8860 6e20809727261599e8527c456eb078c0e89139a1 62b3d206425c245ed0a020390a64640d40d97471 bab2bd8e222de9e596699ac080ea985af828c4c4
 103315 fail a2d8c514753276394d68414f563591f174ef86cb c530a75c1e6a472b0eb9558310b518f0dfcd8860 6e20809727261599e8527c456eb078c0e89139a1 62b3d206425c245ed0a020390a64640d40d97471 bab2bd8e222de9e596699ac080ea985af828c4c4
Searching for interesting versions
 Result found: flight 94728 (pass), for basis pass
 Result found: flight 102920 (fail), for basis failure
 Repro found: flight 103234 (pass), for basis pass
 Repro found: flight 103239 (fail), for basis failure
 0 revisions at 8f620446135b64ca6f96cf32066a76d64e79a388 c530a75c1e6a472b0eb9558310b518f0dfcd8860 6e20809727261599e8527c456eb078c0e89139a1 62b3d206425c245ed0a020390a64640d40d97471 bab2bd8e222de9e596699ac080ea985af828c4c4
No revisions left to test, checking graph state.
 Result found: flight 103290 (pass), for last pass
 Result found: flight 103295 (fail), for first failure
 Repro found: flight 103299 (pass), for last pass
 Repro found: flight 103303 (fail), for first failure
 Repro found: flight 103307 (pass), for last pass
 Repro found: flight 103315 (fail), for first failure

dot: graph is too large for cairo-renderer bitmaps. Scaling by 0.791378 to fit
pnmtopng: 57 colors found
Revision graph left in /home/logs/results/bisect/linux-3.18/test-amd64-amd64-xl-qemut-win7-amd64.xen-boot.{dot,ps,png,html,svg}.
----------------------------------------
103315: tolerable ALL FAIL

flight 103315 linux-3.18 real-bisect [real]
http://logs.test-lab.xenproject.org/osstest/logs/103315/

Failures :-/ but no regressions.

Tests which did not succeed,
including tests which could not be run:
 test-amd64-amd64-xl-qemut-win7-amd64  6 xen-boot        fail baseline untested


jobs:
 test-amd64-amd64-xl-qemut-win7-amd64                         fail    


------------------------------------------------------------
sg-report-flight on osstest.test-lab.xenproject.org
logs: /home/logs/logs
images: /home/logs/images

Logs, config files, etc. are available at
    http://logs.test-lab.xenproject.org/osstest/logs

Explanation of these reports, and of osstest in general, is at
    http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README.email;hb=master
    http://xenbits.xen.org/gitweb/?p=osstest.git;a=blob;f=README;hb=master

Test harness code can be found at
    http://xenbits.xen.org/gitweb?p=osstest.git;a=summary


_______________________________________________
Xen-devel mailing list
Xen-devel@lists.xen.org
https://lists.xen.org/xen-devel

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Regression due to "mm/swap.c: flush lru pvecs on compound page arrival"
  2016-12-14  8:34 [linux-3.18 bisection] complete test-amd64-amd64-xl-qemut-win7-amd64 osstest service owner
@ 2016-12-14 13:29   ` Ian Jackson
  0 siblings, 0 replies; 13+ messages in thread
From: Ian Jackson @ 2016-12-14 13:29 UTC (permalink / raw)
  To: xen-devel, Michal Hocko, Lukasz Odzioba, Kirill Shutemov,
	Andrea Arcangeli, Vladimir Davydov, Ming Li, Minchan Kim, stable,
	Andrew Morton, Linus Torvalds, Sasha Levin

osstest service owner writes ("[linux-3.18 bisection] complete test-amd64-amd64-xl-qemut-win7-amd64"):
> *** Found and reproduced problem changeset ***
> 
>   Bug is in tree:  linux git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
>   Bug introduced:  a2d8c514753276394d68414f563591f174ef86cb
>   Bug not present: 8f620446135b64ca6f96cf32066a76d64e79a388
>   Last fail repro: http://logs.test-lab.xenproject.org/osstest/logs/103315/
> 
>   commit a2d8c514753276394d68414f563591f174ef86cb
>   Author: Lukasz Odzioba <lukasz.odzioba@intel.com>
>   Date:   Fri Jun 24 14:50:01 2016 -0700
>   
>       mm/swap.c: flush lru pvecs on compound page arrival

This commit breaks the test "test-amd64-amd64-xl-qemut-win7-amd64" in
the Xen Project CI system.

Ian.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Regression due to "mm/swap.c: flush lru pvecs on compound page arrival"
  2016-12-14 13:29   ` Ian Jackson
  (?)
@ 2016-12-14 13:39   ` Michal Hocko
  2016-12-14 15:14       ` Ian Jackson
  -1 siblings, 1 reply; 13+ messages in thread
From: Michal Hocko @ 2016-12-14 13:39 UTC (permalink / raw)
  To: Ian Jackson
  Cc: xen-devel, Lukasz Odzioba, Kirill Shutemov, Andrea Arcangeli,
	Vladimir Davydov, Ming Li, Minchan Kim, stable, Andrew Morton,
	Linus Torvalds, Sasha Levin

On Wed 14-12-16 13:29:56, Ian Jackson wrote:
> osstest service owner writes ("[linux-3.18 bisection] complete test-amd64-amd64-xl-qemut-win7-amd64"):
> > *** Found and reproduced problem changeset ***
> > 
> >   Bug is in tree:  linux git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
> >   Bug introduced:  a2d8c514753276394d68414f563591f174ef86cb
> >   Bug not present: 8f620446135b64ca6f96cf32066a76d64e79a388
> >   Last fail repro: http://logs.test-lab.xenproject.org/osstest/logs/103315/
> > 
> >   commit a2d8c514753276394d68414f563591f174ef86cb
> >   Author: Lukasz Odzioba <lukasz.odzioba@intel.com>
> >   Date:   Fri Jun 24 14:50:01 2016 -0700
> >   
> >       mm/swap.c: flush lru pvecs on compound page arrival
> 
> This commit breaks the test "test-amd64-amd64-xl-qemut-win7-amd64" in
> the Xen Project CI system.

Could you be more specific about the regression please?
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Regression due to "mm/swap.c: flush lru pvecs on compound page arrival"
  2016-12-14 13:39   ` Michal Hocko
@ 2016-12-14 15:14       ` Ian Jackson
  0 siblings, 0 replies; 13+ messages in thread
From: Ian Jackson @ 2016-12-14 15:14 UTC (permalink / raw)
  To: Michal Hocko
  Cc: xen-devel, Lukasz Odzioba, Kirill Shutemov, Andrea Arcangeli,
	Vladimir Davydov, Ming Li, Minchan Kim, stable, Andrew Morton,
	Linus Torvalds, Sasha Levin

Michal Hocko writes ("Re: Regression due to "mm/swap.c: flush lru pvecs on compound page arrival""):
> On Wed 14-12-16 13:29:56, Ian Jackson wrote:
> > This commit breaks the test "test-amd64-amd64-xl-qemut-win7-amd64" in
> > the Xen Project CI system.
> 
> Could you be more specific about the regression please?

The effect seems to be that it causes some kind of OOM condition
during boot under Xen:

Dec 14 08:00:04.637998 [   22.584134] Out of memory: Kill process 2747 (exim4) score 2 or sacrifice child

It's not quite clear to me but I think the problem may be hardware
specific.

Full logs are available here:

> > >   Last fail repro: http://logs.test-lab.xenproject.org/osstest/logs/103315/

See in particular this, which is the host serial console output:

http://logs.test-lab.xenproject.org/osstest/logs/103315/test-amd64-amd64-xl-qemut-win7-amd64/serial-elbling1.log

Start reading at Dec 14 07:59:33.478093.  At Dec 14 08:01:38.294433 a
log capture process started sending various debug keys to the console.

Thanks,
Ian.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Regression due to "mm/swap.c: flush lru pvecs on compound page arrival"
  2016-12-14 13:29   ` Ian Jackson
@ 2016-12-14 15:21     ` Michal Hocko
  -1 siblings, 0 replies; 13+ messages in thread
From: Michal Hocko @ 2016-12-14 15:21 UTC (permalink / raw)
  To: Ian Jackson
  Cc: xen-devel, Lukasz Odzioba, Kirill Shutemov, Andrea Arcangeli,
	Vladimir Davydov, Ming Li, Minchan Kim, stable, Andrew Morton,
	Linus Torvalds, Sasha Levin

On Wed 14-12-16 13:29:56, Ian Jackson wrote:
> osstest service owner writes ("[linux-3.18 bisection] complete test-amd64-amd64-xl-qemut-win7-amd64"):
> > *** Found and reproduced problem changeset ***
> > 
> >   Bug is in tree:  linux git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
> >   Bug introduced:  a2d8c514753276394d68414f563591f174ef86cb
> >   Bug not present: 8f620446135b64ca6f96cf32066a76d64e79a388
> >   Last fail repro: http://logs.test-lab.xenproject.org/osstest/logs/103315/
> > 
> >   commit a2d8c514753276394d68414f563591f174ef86cb
> >   Author: Lukasz Odzioba <lukasz.odzioba@intel.com>
> >   Date:   Fri Jun 24 14:50:01 2016 -0700
> >   
> >       mm/swap.c: flush lru pvecs on compound page arrival
> 
> This commit breaks the test "test-amd64-amd64-xl-qemut-win7-amd64" in
> the Xen Project CI system.

Ohh, I can see it now. This is not an upstream commit. This is a 3.18.37
backport which was wrong! You need the follow up fix 52c84a95dc6a
("4.1.28 Fix bad backport of 8f182270dfec "mm/swap.c: flush lru pvecs on
compound page arrival""). The primary problem was that __lru_cache_add
has leaked pages which would explain your OOM.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: Regression due to "mm/swap.c: flush lru pvecs on compound page arrival"
  2016-12-14 15:14       ` Ian Jackson
@ 2016-12-14 16:46         ` Greg KH
  -1 siblings, 0 replies; 13+ messages in thread
From: Greg KH @ 2016-12-14 16:46 UTC (permalink / raw)
  To: Ian Jackson
  Cc: Michal Hocko, xen-devel, Lukasz Odzioba, Kirill Shutemov,
	Andrea Arcangeli, Vladimir Davydov, Ming Li, Minchan Kim, stable,
	Andrew Morton, Linus Torvalds, Sasha Levin

On Wed, Dec 14, 2016 at 03:14:54PM +0000, Ian Jackson wrote:
> Michal Hocko writes ("Re: Regression due to "mm/swap.c: flush lru pvecs on compound page arrival""):
> > On Wed 14-12-16 13:29:56, Ian Jackson wrote:
> > > This commit breaks the test "test-amd64-amd64-xl-qemut-win7-amd64" in
> > > the Xen Project CI system.
> > 
> > Could you be more specific about the regression please?
> 
> The effect seems to be that it causes some kind of OOM condition
> during boot under Xen:
> 
> Dec 14 08:00:04.637998 [   22.584134] Out of memory: Kill process 2747
> (exim4) score 2 or sacrifice child
> 
> It's not quite clear to me but I think the problem may be hardware
> specific.
> 
> Full logs are available here:
> 
> > > >   Last fail repro: http://logs.test-lab.xenproject.org/osstest/logs/103315/
> 
> See in particular this, which is the host serial console output:
> 
> http://logs.test-lab.xenproject.org/osstest/logs/103315/test-amd64-amd64-xl-qemut-win7-amd64/serial-elbling1.log
> 
> Start reading at Dec 14 07:59:33.478093.  At Dec 14 08:01:38.294433 a
> log capture process started sending various debug keys to the console.

And is this also an issue in Linus's tree?

thanks,

greg k-h

* Re: Regression due to "mm/swap.c: flush lru pvecs on compound page arrival"
  2016-12-14 16:46         ` Greg KH
@ 2016-12-14 16:58           ` Ian Jackson
  -1 siblings, 0 replies; 13+ messages in thread
From: Ian Jackson @ 2016-12-14 16:58 UTC (permalink / raw)
  To: Greg KH
  Cc: Michal Hocko, xen-devel, Lukasz Odzioba, Kirill Shutemov,
	Andrea Arcangeli, Vladimir Davydov, Ming Li, Minchan Kim, stable,
	Andrew Morton, Linus Torvalds, Sasha Levin

Greg KH writes ("Re: Regression due to "mm/swap.c: flush lru pvecs on compound page arrival""):
> On Wed, Dec 14, 2016 at 03:14:54PM +0000, Ian Jackson wrote:
> > Start reading at Dec 14 07:59:33.478093.  At Dec 14 08:01:38.294433 a
> > log capture process started sending various debug keys to the console.
> 
> And is this also an issue in Linus's tree?

Sorry, no, I was unclear: this is a problem in 3.18.y.

Ian.

* RE: Regression due to "mm/swap.c: flush lru pvecs on compound page arrival"
  2016-12-14 15:21     ` Michal Hocko
  (?)
@ 2016-12-19 18:53     ` Odzioba, Lukasz
  -1 siblings, 0 replies; 13+ messages in thread
From: Odzioba, Lukasz @ 2016-12-19 18:53 UTC (permalink / raw)
  To: Michal Hocko, Ian Jackson
  Cc: xen-devel, Kirill Shutemov, Andrea Arcangeli, Vladimir Davydov,
	Ming Li, Minchan Kim, stable, Andrew Morton, Linus Torvalds,
	Sasha Levin

On Wednesday, December 14, 2016 4:22 PM, Michal Hocko wrote:
> Ohh, I can see it now. This is not an upstream commit. This is a 3.18.37
> backport which was wrong! You need the follow up fix 52c84a95dc6a
> ("4.1.28 Fix bad backport of 8f182270dfec "mm/swap.c: flush lru pvecs on
> compound page arrival""). The primary problem was that __lru_cache_add
> has leaked pages which would explain your OOM.

Ian, did it solve the problem for you?

Thanks,
Lukas



Thread overview: 13+ messages
2016-12-14  8:34 [linux-3.18 bisection] complete test-amd64-amd64-xl-qemut-win7-amd64 osstest service owner
2016-12-14 13:29 ` Regression due to "mm/swap.c: flush lru pvecs on compound page arrival" Ian Jackson
2016-12-14 13:29   ` Ian Jackson
2016-12-14 13:39   ` Michal Hocko
2016-12-14 15:14     ` Ian Jackson
2016-12-14 15:14       ` Ian Jackson
2016-12-14 16:46       ` Greg KH
2016-12-14 16:46         ` Greg KH
2016-12-14 16:58         ` Ian Jackson
2016-12-14 16:58           ` Ian Jackson
2016-12-14 15:21   ` Michal Hocko
2016-12-14 15:21     ` Michal Hocko
2016-12-19 18:53     ` Odzioba, Lukasz
