Linux kernel notes: buddy (anti-fragment mechanism: steal page) (5)

caoxingyu 2022-02-20

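This continues from <Linux kernel notes: buddy (anti-fragment mechanism) (4)>. When the requested migrate type does not have enough free memory within a zone, the fallback mechanism starts: it searches the fallbacks array for another suitable migrate type and steals pages from it. The core function that performs the steal is steal_suitable_fallback.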

steal_suitable_fallback

steal_suitable_fallback is declared as follows:

void steal_suitable_fallback(struct zone *zone, struct page *page,
		unsigned int alloc_flags, int start_type, bool whole_block)

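What the function does:

  • It is the core function that carries out the steal, and it decides whether stealing pages also changes the migrate type of the pageblock. When the order is large enough, the whole pageblock is taken over in one go and its migratetype is changed. When only part of the pageblock can be stolen, the free pages in the block are moved first, and the pageblock migratetype is changed only if enough of the block is free or compatible. Otherwise the block keeps its old type and ends up mixed, with part of it used by another migratetype.

Parameters:

  • struct zone *zone: the zone that the requested pages belong to.
  • struct page *page: the first page of the free block to steal.
  • unsigned int alloc_flags: the alloc flags used by the allocation.
  • int start_type: the migrate type requested by the allocation.
  • bool whole_block: whether the whole pageblock may be stolen.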

steal_suitable_fallback flow

[Flowchart of the steal_suitable_fallback processing flow; figure not preserved]

steal_suitable_fallback source

An analysis based on the steal_suitable_fallback source:


/*
 * This function implements actual steal behaviour. If order is large enough,
 * we can steal whole pageblock. If not, we first move freepages in this
 * pageblock to our migratetype and determine how many already-allocated pages
 * are there in the pageblock with a compatible migratetype. If at least half
 * of pages are free or compatible, we can change migratetype of the pageblock
 * itself, so pages freed in the future will be put on the correct free list.
 */
static void steal_suitable_fallback(struct zone *zone, struct page *page,
		unsigned int alloc_flags, int start_type, bool whole_block)
{
	unsigned int current_order = page_order(page);
	int free_pages, movable_pages, alike_pages;
	int old_block_type;

	old_block_type = get_pageblock_migratetype(page);

	/*
	 * This can happen due to races and we want to prevent broken
	 * highatomic accounting.
	 */
	if (is_migrate_highatomic(old_block_type))
		goto single_page;

	/* Take ownership for orders >= pageblock_order */
	if (current_order >= pageblock_order) {
		change_pageblock_range(page, current_order, start_type);
		goto single_page;
	}

	/*
	 * Boost watermarks to increase reclaim pressure to reduce the
	 * likelihood of future fallbacks. Wake kswapd now as the node
	 * may be balanced overall and kswapd will not wake naturally.
	 */
	boost_watermark(zone);
	if (alloc_flags & ALLOC_KSWAPD)
		set_bit(ZONE_BOOSTED_WATERMARK, &zone->flags);

	/* We are not allowed to try stealing from the whole block */
	if (!whole_block)
		goto single_page;

	free_pages = move_freepages_block(zone, page, start_type,
						&movable_pages);
	/*
	 * Determine how many pages are compatible with our allocation.
	 * For movable allocation, it's the number of movable pages which
	 * we just obtained. For other types it's a bit more tricky.
	 */
	if (start_type == MIGRATE_MOVABLE) {
		alike_pages = movable_pages;
	} else {
		/*
		 * If we are falling back a RECLAIMABLE or UNMOVABLE allocation
		 * to MOVABLE pageblock, consider all non-movable pages as
		 * compatible. If it's UNMOVABLE falling back to RECLAIMABLE or
		 * vice versa, be conservative since we can't distinguish the
		 * exact migratetype of non-movable pages.
		 */
		if (old_block_type == MIGRATE_MOVABLE)
			alike_pages = pageblock_nr_pages
						- (free_pages + movable_pages);
		else
			alike_pages = 0;
	}

	/* moving whole block can fail due to zone boundary conditions */
	if (!free_pages)
		goto single_page;

	/*
	 * If a sufficient number of pages in the block are either free or of
	 * comparable migratability as our allocation, claim the whole block.
	 */
	if (free_pages + alike_pages >= (1 << (pageblock_order-1)) ||
			page_group_by_mobility_disabled)
		set_pageblock_migratetype(page, start_type);

	return;

single_page:
	move_to_free_list(page, zone, current_order, start_type);
}
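  • Read the migrate type of the pageblock that contains the page to steal (old_block_type).
  • If that type is MIGRATE_HIGHATOMIC (which, according to the code comment, can happen due to races), the pageblock migrate type is left untouched so that highatomic accounting stays correct. Only the page itself is moved to the target free list with move_to_free_list.
  • If current_order of the stolen page is greater than or equal to pageblock_order, the free block covers at least one whole pageblock. change_pageblock_range sets the migrate type of every covered pageblock to start_type, and move_to_free_list then moves the block to the target free list.
  • If neither case applies, further checks decide whether the pageblock migrate type may be changed.
  • First, boost_watermark(zone) boosts the zone watermark, which increases reclaim pressure and makes future fallbacks less likely.
  • If alloc_flags contains ALLOC_KSWAPD, ZONE_BOOSTED_WATERMARK is set in zone->flags. The fallback happened because memory was short, so kswapd can be woken early to reclaim memory even when the node looks balanced overall.
  • If whole_block is false, the pageblock migrate type must not be changed: only the page is moved, the block type stays as it is.
  • move_freepages_block: moves the free pages of the pageblock to the start_type free lists and returns how many were moved. If nothing was moved, only the single page is stolen. If free_pages + alike_pages >= 1 << (pageblock_order - 1), that is, at least half of the pageblock is free or compatible (or page_group_by_mobility_disabled is set), the pageblock migrate type is changed with set_pageblock_migratetype.
  • move_to_free_list: moves the page from the free list of the old migrate type to the free list of the new migrate type, so that the new migrate type has memory available for the current allocation.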

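Rules for changing the pageblock migrate type

From the steal_suitable_fallback flow, the rules for when the pageblock migrate type may be changed are:

  • If the old pageblock type is MIGRATE_HIGHATOMIC, the type is never changed.
  • If the order of the page is greater than or equal to pageblock_order, the pageblock migrate type is changed directly.
  • If the order is smaller than pageblock_order and whole_block is false, the pageblock migrate type is not changed.
  • If the order is smaller than pageblock_order and whole_block is true, the free pages of the block are moved and alike_pages is computed. For a MIGRATE_MOVABLE request, alike_pages is the number of movable in-use pages. For other requests stealing from a MIGRATE_MOVABLE block, all non-movable in-use pages count as compatible, so alike_pages = pageblock_nr_pages - (free_pages + movable_pages). In the remaining cases alike_pages is 0.
  • The type is then changed only if free_pages is non-zero and free_pages + alike_pages >= 1 << (pageblock_order - 1), that is, at least half of the pageblock, or if page_group_by_mobility_disabled is set.
  • In all other cases the pageblock migrate type is not changed.

The pageblock migrate type is changed with set_pageblock_migratetype().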
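The rules above for the case order < pageblock_order with whole_block set can be sketched as a small userspace model. This is only an illustration under stated assumptions: the enum, the function name sketch_should_claim_block, and passing pageblock_order and the mobility flag as parameters are all hypothetical and are not kernel code.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins for the kernel migrate types (assumed values). */
enum sketch_migratetype {
	SKETCH_UNMOVABLE,
	SKETCH_MOVABLE,
	SKETCH_RECLAIMABLE,
};

/*
 * Userspace model of the claim test in steal_suitable_fallback() for
 * current_order < pageblock_order and whole_block == true.
 * Returns true when the pageblock migrate type would be changed.
 */
static bool sketch_should_claim_block(int free_pages, int movable_pages,
				      int start_type, int old_block_type,
				      unsigned int pageblock_order,
				      bool mobility_disabled)
{
	int pageblock_nr_pages = 1 << pageblock_order;
	int alike_pages;

	assert(free_pages >= 0 && movable_pages >= 0);
	assert(free_pages + movable_pages <= pageblock_nr_pages);

	if (start_type == SKETCH_MOVABLE)
		alike_pages = movable_pages;
	else if (old_block_type == SKETCH_MOVABLE)
		/* every in-use page that is not movable counts as compatible */
		alike_pages = pageblock_nr_pages - (free_pages + movable_pages);
	else
		alike_pages = 0;

	/* Nothing was moved (e.g. zone boundary): steal a single page only. */
	if (!free_pages)
		return false;

	return free_pages + alike_pages >= (1 << (pageblock_order - 1)) ||
	       mobility_disabled;
}
```

With an assumed pageblock_order of 9 (512 pages, threshold 256), an unmovable request that finds 100 free and 300 movable pages in a movable block gets alike_pages = 112, and 212 is below the threshold, so the block keeps its type.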

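Moving pages between free lists

Pages are moved with move_to_free_list, which removes the page from its current free list and adds it to the free list of the new migrate type, within the same order: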

/* Used for pages which are on another list */
static inline void move_to_free_list(struct page *page, struct zone *zone,
				     unsigned int order, int migratetype)
{
	struct free_area *area = &zone->free_area[order];

	list_move(&page->lru, &area->free_list[migratetype]);
}
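To make the list_move step concrete, here is a minimal userspace sketch of a circular doubly linked list with a move operation. The sketch_ names are hypothetical stand-ins written for this example; they only mimic the shape of the kernel list helpers and are not the kernel implementation.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal circular doubly linked list node, modelled on struct list_head. */
struct sketch_list_head {
	struct sketch_list_head *next, *prev;
};

/* An empty list is a head that points to itself. */
static void sketch_list_init(struct sketch_list_head *head)
{
	head->next = head;
	head->prev = head;
}

/* Insert entry right after head. */
static void sketch_list_add(struct sketch_list_head *entry,
			    struct sketch_list_head *head)
{
	entry->next = head->next;
	entry->prev = head;
	head->next->prev = entry;
	head->next = entry;
}

/* Unlink entry from whatever list it is currently on. */
static void sketch_list_del(struct sketch_list_head *entry)
{
	entry->prev->next = entry->next;
	entry->next->prev = entry->prev;
}

/* Move entry from its current list to the front of another list. */
static void sketch_list_move(struct sketch_list_head *entry,
			     struct sketch_list_head *head)
{
	sketch_list_del(entry);
	sketch_list_add(entry, head);
}

static int sketch_list_empty(const struct sketch_list_head *head)
{
	return head->next == head;
}
```

In this model a page's list node starts on one free list and, after sketch_list_move, is the first entry of the other list while the source list becomes empty. Because both lists belong to the same order, only the migrate type bucket changes.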

page_group_by_mobility_disabled

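Grouping pages by migrate type can be switched on and off through page_group_by_mobility_disabled. When the zonelists are built during system boot, the kernel sets it based on the total amount of physical memory in the system: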


/*
 * unless system_state == SYSTEM_BOOTING.
 *
 * __ref due to call of __init annotated helper build_all_zonelists_init
 * [protected by SYSTEM_BOOTING].
 */
void __ref build_all_zonelists(pg_data_t *pgdat)
{
    ... ...
	/*
	 * Disable grouping by mobility if the number of pages in the
	 * system is too low to allow the mechanism to work. It would be
	 * more accurate, but expensive to check per-zone. This check is
	 * made on memory-hotadd so a system can start with mobility
	 * disabled and enable it later
	 */
	if (vm_total_pages < (pageblock_nr_pages * MIGRATE_TYPES))
		page_group_by_mobility_disabled = 1;
	else
		page_group_by_mobility_disabled = 0;

	pr_info("Built %u zonelists, mobility grouping %s.  Total pages: %ld\n",
		nr_online_nodes,
		page_group_by_mobility_disabled ? "off" : "on",
		vm_total_pages);
#ifdef CONFIG_NUMA
	pr_info("Policy zone: %s\n", zone_names[policy_zone]);
#endif
}
  • When the system has fewer than pageblock_nr_pages * MIGRATE_TYPES pages, grouping by migrate type is turned off.
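The threshold is simple arithmetic, sketched below as a userspace function. The function name and the idea of passing every value as a parameter are assumptions made for this example; in the kernel the inputs are vm_total_pages, pageblock_nr_pages and MIGRATE_TYPES, and their actual values depend on the architecture and kernel configuration.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Model of the check in build_all_zonelists(): grouping by mobility is
 * turned off when the system has fewer pages than one pageblock per
 * migrate type.
 */
static bool sketch_mobility_grouping_disabled(unsigned long total_pages,
					      unsigned int pageblock_order,
					      unsigned int nr_migrate_types)
{
	unsigned long pageblock_nr_pages = 1UL << pageblock_order;

	assert(nr_migrate_types > 0);
	return total_pages < pageblock_nr_pages * nr_migrate_types;
}
```

For example, with an assumed pageblock_order of 9 and 6 migrate types, the threshold is 512 * 6 = 3072 pages, which is 12 MiB with 4 KiB pages.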

move_freepages_block

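move_freepages_block() is used when a steal is allowed to work on the whole pageblock: it moves the free pages of that pageblock to the free lists of the target migrate type: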


int move_freepages_block(struct zone *zone, struct page *page,
				int migratetype, int *num_movable)
{
	unsigned long start_pfn, end_pfn;
	struct page *start_page, *end_page;

	if (num_movable)
		*num_movable = 0;

	start_pfn = page_to_pfn(page);
	start_pfn = start_pfn & ~(pageblock_nr_pages-1);
	start_page = pfn_to_page(start_pfn);
	end_page = start_page + pageblock_nr_pages - 1;
	end_pfn = start_pfn + pageblock_nr_pages - 1;

	/* Do not cross zone boundaries */
	if (!zone_spans_pfn(zone, start_pfn))
		start_page = page;
	if (!zone_spans_pfn(zone, end_pfn))
		return 0;

	return move_freepages(zone, start_page, end_page, migratetype,
								num_movable);
}
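  • start_pfn = page_to_pfn(page): get the pfn of the page to move.
  • start_pfn = start_pfn & ~(pageblock_nr_pages-1): align the page frame number down to the pageblock boundary.
  • start_page = pfn_to_page(start_pfn): the first page of the pageblock after alignment.
  • end_page = start_page + pageblock_nr_pages - 1: the last page of the pageblock.
  • end_pfn = start_pfn + pageblock_nr_pages - 1: the last pfn of the pageblock.
  • start_pfn and end_pfn are each checked against the zone: if start_pfn lies outside the zone, the walk starts at page instead; if end_pfn lies outside the zone, nothing is moved and 0 is returned.
  • move_freepages: moves the free pages in the given range in one batch.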
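The alignment arithmetic can be tried in isolation. The sketch below is a userspace illustration; the struct and function names are hypothetical, and pageblock_order is passed as a parameter rather than taken from the kernel.

```c
#include <assert.h>

/* Inclusive pfn range of one pageblock. */
struct sketch_pfn_range {
	unsigned long start_pfn;
	unsigned long end_pfn;
};

/*
 * Compute the pageblock that contains pfn, using the same mask
 * arithmetic as move_freepages_block(). The mask works because
 * pageblock_nr_pages is a power of two.
 */
static struct sketch_pfn_range sketch_pageblock_range(unsigned long pfn,
						      unsigned int pageblock_order)
{
	unsigned long pageblock_nr_pages = 1UL << pageblock_order;
	struct sketch_pfn_range r;

	assert(pageblock_order < 8 * sizeof(unsigned long));

	r.start_pfn = pfn & ~(pageblock_nr_pages - 1);
	r.end_pfn = r.start_pfn + pageblock_nr_pages - 1;
	return r;
}
```

With an assumed pageblock_order of 9, pfn 0x12345 falls in the pageblock covering 0x12200 through 0x123ff.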

move_freepages

move_freepages moves the free pages in a given range to the free lists of the given migrate type:


/*
 * Move the free pages in a range to the free lists of the requested type.
 * Note that start_page and end_pages are not aligned on a pageblock
 * boundary. If alignment is required, use move_freepages_block()
 */
static int move_freepages(struct zone *zone,
			  struct page *start_page, struct page *end_page,
			  int migratetype, int *num_movable)
{
	struct page *page;
	unsigned int order;
	int pages_moved = 0;

	for (page = start_page; page <= end_page;) {
		if (!pfn_valid_within(page_to_pfn(page))) {
			page++;
			continue;
		}

		if (!PageBuddy(page)) {
			/*
			 * We assume that pages that could be isolated for
			 * migration are movable. But we don't actually try
			 * isolating, as that would be expensive.
			 */
			if (num_movable &&
					(PageLRU(page) || __PageMovable(page)))
				(*num_movable)++;

			page++;
			continue;
		}

		/* Make sure we are not inadvertently changing nodes */
		VM_BUG_ON_PAGE(page_to_nid(page) != zone_to_nid(zone), page);
		VM_BUG_ON_PAGE(page_zone(page) != zone, page);

		order = page_order(page);
		move_to_free_list(page, zone, order, migratetype);
		page += 1 << order;
		pages_moved += 1 << order;
	}

	return pages_moved;
}
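  • Walk the pages in the given range; when called from move_freepages_block, the walk normally starts at the pageblock-aligned first page.
  • Pages that are in use (not PageBuddy) are skipped one at a time; if num_movable is given, those that are PageLRU or __PageMovable are counted as movable.
  • move_to_free_list: moves each free buddy block to the free list of the target migrate type at the same order.
  • Advance by 1 << order pages to the next block and add the same amount to pages_moved.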
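The walk can be modelled in userspace with an array of simplified page descriptors. Everything here is an assumption made for illustration: struct sketch_page and its fields stand in for struct page state, and setting list_type stands in for move_to_free_list. Unlike the kernel function, this model also resets *num_movable itself, which in the kernel is done by move_freepages_block.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified page descriptor; the fields are stand-ins, not struct page. */
struct sketch_page {
	bool buddy;		/* head of a free buddy block */
	unsigned int order;	/* valid only when buddy is true */
	bool movable;		/* in-use page that is LRU or movable */
	int list_type;		/* migrate type of the free list it sits on */
};

/* Returns the number of free pages moved to the migratetype list. */
static int sketch_move_freepages(struct sketch_page *pages, size_t nr_pages,
				 int migratetype, int *num_movable)
{
	int pages_moved = 0;
	size_t i = 0;

	if (num_movable)
		*num_movable = 0;

	while (i < nr_pages) {
		struct sketch_page *page = &pages[i];

		if (!page->buddy) {
			/* in-use page: only count it, then step one page */
			if (num_movable && page->movable)
				(*num_movable)++;
			i++;
			continue;
		}

		/* a free block must fit inside the walked range */
		assert(i + ((size_t)1 << page->order) <= nr_pages);

		page->list_type = migratetype;
		i += (size_t)1 << page->order;
		pages_moved += 1 << page->order;
	}

	return pages_moved;
}
```

For an 8-page range holding a free order-2 block, one movable in-use page, one non-movable in-use page, and a free order-1 block, the model moves 6 pages and counts 1 movable page.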

can_steal_fallback

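can_steal_fallback uses the order and the requesting migrate type to decide whether a steal is allowed to work on the whole pageblock. If it is, the free pages of the whole pageblock are moved, and the pageblock migrate type is changed when the conditions in steal_suitable_fallback are met: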


/*
 * When we are falling back to another migratetype during allocation, try to
 * steal extra free pages from the same pageblocks to satisfy further
 * allocations, instead of polluting multiple pageblocks.
 *
 * If we are stealing a relatively large buddy page, it is likely there will
 * be more free pages in the pageblock, so try to steal them all. For
 * reclaimable and unmovable allocations, we steal regardless of page size,
 * as fragmentation caused by those allocations polluting movable pageblocks
 * is worse than movable allocations stealing from unmovable and reclaimable
 * pageblocks.
 */
static bool can_steal_fallback(unsigned int order, int start_mt)
{
	/*
	 * Leaving this order check is intended, although there is
	 * relaxed order check in next check. The reason is that
	 * we can actually steal whole pageblock if this condition met,
	 * but, below check doesn't guarantee it and that is just heuristic
	 * so could be changed anytime.
	 */
	if (order >= pageblock_order)
		return true;

	if (order >= pageblock_order / 2 ||
		start_mt == MIGRATE_RECLAIMABLE ||
		start_mt == MIGRATE_UNMOVABLE ||
		page_group_by_mobility_disabled)
		return true;

	return false;
}
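  • order >= pageblock_order: the request needs at least one whole pageblock, so the whole pageblock may be taken.
  • order >= pageblock_order / 2: the stolen block is relatively large, so the rest of the pageblock is likely to hold more free pages; the whole pageblock may also be taken. The code comment notes that this is only a heuristic.
  • start_mt == MIGRATE_RECLAIMABLE: a reclaimable allocation may steal the whole pageblock regardless of order, because such allocations polluting movable pageblocks cause worse fragmentation than the reverse.
  • start_mt == MIGRATE_UNMOVABLE: the same applies to unmovable allocations.
  • page_group_by_mobility_disabled: when grouping by mobility is off, stealing the whole pageblock is always allowed.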
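The conditions above can be checked with a small userspace model. The enum and function names are hypothetical, and pageblock_order and the mobility flag are passed as parameters instead of being read from kernel state.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative stand-ins for the kernel migrate types (assumed values). */
enum sketch_mt {
	SKETCH_MT_UNMOVABLE,
	SKETCH_MT_MOVABLE,
	SKETCH_MT_RECLAIMABLE,
};

/* Userspace model of the decision made by can_steal_fallback(). */
static bool sketch_can_steal_fallback(unsigned int order, int start_mt,
				      unsigned int pageblock_order,
				      bool mobility_disabled)
{
	assert(pageblock_order > 0);

	/* a request this large always covers a whole pageblock */
	if (order >= pageblock_order)
		return true;

	/* heuristic part: large blocks, or types that fragment badly */
	if (order >= pageblock_order / 2 ||
	    start_mt == SKETCH_MT_RECLAIMABLE ||
	    start_mt == SKETCH_MT_UNMOVABLE ||
	    mobility_disabled)
		return true;

	return false;
}
```

With an assumed pageblock_order of 9, the integer division gives a cutoff of order 4: a movable request of order 3 may not steal the whole block, a movable request of order 4 may, and unmovable or reclaimable requests may at any order.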