From c5fbd937b603885f1db3280ca212ed28add895bc Mon Sep 17 00:00:00 2001 From: Mel Gorman Date: Tue, 5 Mar 2019 15:44:25 -0800 Subject: [PATCH] mm, compaction: shrink compact_control Patch series "Increase success rates and reduce latency of compaction", v3. This series reduces scan rates and success rates of compaction, primarily by using the free lists to shorten scans, better controlling of skip information and whether multiple scanners can target the same block and capturing pageblocks before being stolen by parallel requests. The series is based on mmotm from January 9th, 2019 with the previous compaction series reverted. I'm mostly using thpscale to measure the impact of the series. The benchmark creates a large file, maps it, faults it, punches holes in the mapping so that the virtual address space is fragmented and then tries to allocate THP. It re-executes for different numbers of threads. From a fragmentation perspective, the workload is relatively benign but it does stress compaction. The overall impact on latencies for a 1-socket machine is baseline patches Amean fault-both-3 3832.09 ( 0.00%) 2748.56 * 28.28%* Amean fault-both-5 4933.06 ( 0.00%) 4255.52 ( 13.73%) Amean fault-both-7 7017.75 ( 0.00%) 6586.93 ( 6.14%) Amean fault-both-12 11610.51 ( 0.00%) 9162.34 * 21.09%* Amean fault-both-18 17055.85 ( 0.00%) 11530.06 * 32.40%* Amean fault-both-24 19306.27 ( 0.00%) 17956.13 ( 6.99%) Amean fault-both-30 22516.49 ( 0.00%) 15686.47 * 30.33%* Amean fault-both-32 23442.93 ( 0.00%) 16564.83 * 29.34%* The allocation success rates are much improved baseline patches Percentage huge-3 85.99 ( 0.00%) 97.96 ( 13.92%) Percentage huge-5 88.27 ( 0.00%) 96.87 ( 9.74%) Percentage huge-7 85.87 ( 0.00%) 94.53 ( 10.09%) Percentage huge-12 82.38 ( 0.00%) 98.44 ( 19.49%) Percentage huge-18 83.29 ( 0.00%) 99.14 ( 19.04%) Percentage huge-24 81.41 ( 0.00%) 97.35 ( 19.57%) Percentage huge-30 80.98 ( 0.00%) 98.05 ( 21.08%) Percentage huge-32 80.53 ( 0.00%) 97.06 ( 20.53%) That's a nearly perfect allocation success rate. The biggest impact is on the scan rates Compaction migrate scanned 55893379 19341254 Compaction free scanned 474739990 11903963 The number of pages scanned for migration was reduced by 65% and the free scanner was reduced by 97.5%. So much less work in exchange for lower latency and better success rates. The series was also evaluated using a workload that heavily fragments memory but the benefits there are also significant, albeit not presented. It was commented that we should be rethinking scanning entirely and to a large extent I agree. However, to achieve that you need a lot of this series in place first so it's best to make the linear scanners as best as possible before ripping them out. This patch (of 22): The isolate and migrate scanners should never isolate more than a pageblock of pages so unsigned int is sufficient saving 8 bytes on a 64-bit build. Link: http://lkml.kernel.org/r/20190118175136.31341-2-mgorman@techsingularity.net Signed-off-by: Mel Gorman Acked-by: Vlastimil Babka Cc: David Rientjes Cc: Andrea Arcangeli Cc: Dan Carpenter Cc: YueHaibing Signed-off-by: Andrew Morton Signed-off-by: Linus Torvalds --- mm/internal.h | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/mm/internal.h b/mm/internal.h index 536bc2a839b9..5564841fce36 100644 --- a/mm/internal.h +++ b/mm/internal.h @@ -185,8 +185,8 @@ struct compact_control { struct list_head freepages; /* List of free pages to migrate to */ struct list_head migratepages; /* List of pages being migrated */ struct zone *zone; - unsigned long nr_freepages; /* Number of isolated free pages */ - unsigned long nr_migratepages; /* Number of pages to migrate */ + unsigned int nr_freepages; /* Number of isolated free pages */ + unsigned int nr_migratepages; /* Number of pages to migrate */ unsigned long total_migrate_scanned; unsigned long total_free_scanned; unsigned long free_pfn; /* isolate_freepages search base */