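Android lmkd in Detail
Contents
1. Introduction to lmkd
2. How lmkd works
2.1 Working principle diagram
2.2 Initialization
2.3 Receiving oom_adj
2.4 Listening for and handling PSI events
2.5 Process selection and killing
2.5.1 Process selection
2.5.2 Process killing
3. Key system properties
4. Core data structures
5. Code sequence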
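1. Introduction to lmkd
Android lmkd uses epoll to listen for trigger events raised by the Linux kernel's PSI (Pressure Stall Information) memory pressure levels. Based on the PSI pressure level and each process's priority (oom_adj), it selects target processes and kills them to relieve memory pressure on the system.
2. How lmkd works
2.1 Working principle diagram
The mechanism has three parts:
1) When an app starts, AMS sends its oom_adj to the lmkd process over a socket, and lmkd stores it.
2) The lmkd process enters a loop and listens for kernel PSI events.
3) When a PSI event occurs, lmkd kills processes according to its policy.
2.2 Initialization
lmkd initializes itself at startup. It mainly does three things:
1) Starts the lmkd socket and listens on it, waiting for clients to connect
2) Initializes the PSI monitors
3) Initializes the reaper, which creates the threads that kill processes asynchronously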
static int init(void) {
    ...
    update_psi_window_size();
    ...
    // 1. Set up the lmkd control socket and listen on it for client connections
    ctrl_sock.sock = android_get_control_socket("lmkd");
    listen(ctrl_sock.sock, MAX_DATA_CONN);
    ctrl_sock.handler_info.handler = ctrl_connect_handler;
    epev.data.ptr = (void *)&(ctrl_sock.handler_info);
    epoll_ctl(epollfd, EPOLL_CTL_ADD, ctrl_sock.sock, &epev);
    ...
    // 2. Initialize the PSI monitors
    init_monitors();
    // 3. Initialize the reaper, which creates the asynchronous kill threads
    init_reaper();
    ...
}
The following code shows how the PSI file descriptors are added to epoll and how the callbacks are registered:
static bool init_monitors() {
    /* Try to use psi monitor first if kernel has it */
    // New strategy: monitor memory pressure through PSI
    use_psi_monitors = GET_LMK_PROPERTY(bool, "use_psi", true) &&
        init_psi_monitors();
    // Legacy vmpressure strategy; not taken when the PSI monitors initialize successfully
    /* Fall back to vmpressure */
    if (!use_psi_monitors &&
        (!init_mp_common(VMPRESS_LEVEL_LOW) ||
        !init_mp_common(VMPRESS_LEVEL_MEDIUM) ||
        !init_mp_common(VMPRESS_LEVEL_CRITICAL))) {
        ALOGE("Kernel does not support memory pressure events or in-kernel low memory killer");
        return false;
    }
    ...
}

static bool init_psi_monitors() {
    /*
     * When PSI is used on low-ram devices or on high-end devices without memfree levels
     * use new kill strategy based on zone watermarks, free swap and thrashing stats.
     * Also use the new strategy if memcg has not been mounted in the v1 cgroups hierarchy since
     * the old strategy relies on memcg attributes that are available only in the v1 cgroups
     * hierarchy.
     */
    bool use_new_strategy =
        GET_LMK_PROPERTY(bool, "use_new_strategy", low_ram_device || !use_minfree_levels);
    if (force_use_old_strategy) {
        use_new_strategy = false;
    }
    /* In default PSI mode override stall amounts using system properties */
    if (use_new_strategy) {
        /* Do not use low pressure level */
        psi_thresholds[VMPRESS_LEVEL_LOW].threshold_ms = 0;
        psi_thresholds[VMPRESS_LEVEL_MEDIUM].threshold_ms = psi_partial_stall_ms;
        psi_thresholds[VMPRESS_LEVEL_CRITICAL].threshold_ms = psi_complete_stall_ms;
    } else {
        psi_thresholds[VMPRESS_LEVEL_LOW].threshold_ms = PSI_OLD_LOW_THRESH_MS;
        psi_thresholds[VMPRESS_LEVEL_MEDIUM].threshold_ms = PSI_OLD_MED_THRESH_MS;
        psi_thresholds[VMPRESS_LEVEL_CRITICAL].threshold_ms = PSI_OLD_CRIT_THRESH_MS;
    }
    // Register a monitor for each pressure level: VMPRESS_LEVEL_LOW, VMPRESS_LEVEL_MEDIUM,
    // VMPRESS_LEVEL_CRITICAL and VMPRESS_LEVEL_SUPER_CRITICAL
    init_mp_psi(VMPRESS_LEVEL_LOW, use_new_strategy);
    init_mp_psi(VMPRESS_LEVEL_MEDIUM, use_new_strategy);
    init_mp_psi(VMPRESS_LEVEL_CRITICAL, use_new_strategy);
    init_mp_psi(VMPRESS_LEVEL_SUPER_CRITICAL, use_new_strategy);
    ...
}

static bool init_mp_psi(enum vmpressure_level level, bool use_new_strategy) {
    int fd;
    /* Do not register a handler if threshold_ms is not set */
    if (!psi_thresholds[level].threshold_ms) {
        return true;
    }
    // 1. Create the PSI trigger by writing the stall type, threshold and window
    //    to the "/proc/pressure/memory" node
    fd = init_psi_monitor(psi_thresholds[level].stall_type,
        psi_thresholds[level].threshold_ms * US_PER_MS,
        psi_window_size_ms * US_PER_MS);
    // Callback: mp_event_psi (new strategy) or mp_event_common (old strategy)
    vmpressure_hinfo[level].handler = use_new_strategy ? mp_event_psi : mp_event_common;
    vmpressure_hinfo[level].data = level;
    // 2. Register the callback for the epoll event
    register_psi_monitor(epollfd, fd, &vmpressure_hinfo[level]);
    ...
}

int init_psi_monitor(enum psi_stall_type stall_type,
                     int threshold_us, int window_us) {
    ...
    // Open the "/proc/pressure/memory" node
    fd = TEMP_FAILURE_RETRY(open(PSI_PATH_MEMORY, O_WRONLY | O_CLOEXEC));
    ...
    res = TEMP_FAILURE_RETRY(write(fd, buf, strlen(buf) + 1));
    ...
}

// Add the PSI fd to epoll; the handler info travels in epev.data.ptr
int register_psi_monitor(int epollfd, int fd, void* data) {
    int res;
    struct epoll_event epev;
    epev.events = EPOLLPRI;
    epev.data.ptr = data;
    res = epoll_ctl(epollfd, EPOLL_CTL_ADD, fd, &epev);
    if (res < 0) {
        ALOGE("epoll_ctl for psi monitor failed; errno=%d", errno);
    }
    return res;
}
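The trigger that init_psi_monitor writes can be sketched as follows. This is a minimal illustration, assuming the kernel's documented PSI trigger format "<some|full> <stall threshold in us> <window in us>"; the names StallType, format_psi_trigger and arm_psi_trigger are hypothetical and are not lmkd functions.

```cpp
#include <cassert>
#include <cstdio>
#include <string>
#include <fcntl.h>
#include <unistd.h>

// Stall types accepted by the kernel's PSI trigger interface.
enum class StallType { Some, Full };

// Build the trigger string "<some|full> <threshold_us> <window_us>".
std::string format_psi_trigger(StallType type, int threshold_us, int window_us) {
    assert(threshold_us > 0 && window_us >= threshold_us);
    char buf[64];
    std::snprintf(buf, sizeof(buf), "%s %d %d",
                  type == StallType::Some ? "some" : "full",
                  threshold_us, window_us);
    return buf;
}

// Open the PSI memory node and arm one trigger. Returns an fd that can be
// added to epoll with EPOLLPRI, or -1 if PSI is unavailable or the write fails.
int arm_psi_trigger(const std::string& trigger) {
    int fd = open("/proc/pressure/memory", O_WRONLY | O_CLOEXEC);
    if (fd < 0) {
        return -1;
    }
    // Like the lmkd code above, write the string including its terminating NUL.
    if (write(fd, trigger.c_str(), trigger.size() + 1) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

For example, format_psi_trigger(StallType::Some, 70000, 1000000) asks to be notified when tasks were partially stalled on memory for 70 ms within any 1 s window; these numbers are illustrative, not values taken from the article.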
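2.3 Receiving oom_adj
After a process starts, AMS passes its oom_adj to the lmkd process through the lmkd socket. lmkd stores it in procadjslot_list, an array of struct adjslot_list, which is later used to choose processes to kill by oom_adj.
Basic flow:
1) During init, lmkd sets up epoll monitoring of the lmkd socket
2) The AMS client writes data to the lmkd socket
3) This triggers the epoll event for the lmkd socket; the server side receives the data and runs the handler, which stores the process and its oom_adj in the procadjslot_list array
The core code is as follows:
1) lmkd code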
static int init(void) {
    ...
    // Set up the lmkd control socket and listen on it for client connections
    ctrl_sock.sock = android_get_control_socket("lmkd");
    listen(ctrl_sock.sock, MAX_DATA_CONN);
    ctrl_sock.handler_info.handler = ctrl_connect_handler; // epoll event handler
    epev.data.ptr = (void *)&(ctrl_sock.handler_info);
    epoll_ctl(epollfd, EPOLL_CTL_ADD, ctrl_sock.sock, &epev);
    ...
}

// Handler for data arriving on an accepted client connection
static void ctrl_data_handler(int data, uint32_t events,
                              struct polling_params *poll_params __unused) {
    if (events & EPOLLIN) {
        ctrl_command_handler(data);
    }
}

static void ctrl_command_handler(int dsock_idx) {
    ...
    // Read the data from the lmkd socket
    ctrl_data_read(dsock_idx, (char *)packet, CTRL_PACKET_MAX_SIZE, &cred);
    cmd = lmkd_pack_get_cmd(packet);
    switch(cmd) {
    // The AMS client writes LMK_PROCPRIO plus the process info (pid, oom_adj) to the lmkd socket
    case LMK_PROCPRIO:
        /* process type field is optional for backward compatibility */
        if (nargs < 3 || nargs > 4)
            goto wronglen;
        cmd_procprio(packet, nargs, &cred);
        break;
    ...
}

static void cmd_procprio(LMKD_CTRL_PACKET packet, int field_count, struct ucred *cred) {
    ...
    proc_slot(procp);
}

// Link the process into the list that belongs to its oom_adj slot
static void proc_slot(struct proc *procp) {
    int adjslot = ADJTOSLOT(procp->oomadj);
    std::scoped_lock lock(adjslot_list_lock);
    adjslot_insert(&procadjslot_list[adjslot], &procp->asl);
}

// Insert new_element right after the list head
static void adjslot_insert(struct adjslot_list *head, struct adjslot_list *new_element)
{
    struct adjslot_list *next = head->next;
    new_element->prev = head;
    new_element->next = next;
    next->prev = new_element;
    head->next = new_element;
}
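The slot bookkeeping above can be illustrated with a small self-contained sketch. It assumes that ADJTOSLOT simply shifts oom_adj by 1000 so that -1000 maps to index 0; that mapping, and the names SlotTable, Node and Proc, are assumptions made for illustration. The insert logic mirrors adjslot_insert, and tail() mirrors how the oldest entry of a slot can be found through head->prev.

```cpp
#include <cassert>
#include <cstddef>

constexpr int kAdjMin = -1000;  // OOM_SCORE_ADJ_MIN
constexpr int kAdjMax = 1000;   // OOM_SCORE_ADJ_MAX
constexpr int kSlotCount = kAdjMax - kAdjMin + 1;

// Assumed mapping: shift oom_adj so that -1000 lands on index 0.
constexpr int adj_to_slot(int adj) { return adj - kAdjMin; }

struct Node {
    Node* next;
    Node* prev;
};

// The list node is the first member, so a Node* can be cast back to a Proc*.
struct Proc {
    Node asl;
    int pid;
    int oomadj;
};

struct SlotTable {
    Node heads[kSlotCount];

    // Every slot starts as an empty circular list: the head points at itself.
    SlotTable() {
        for (Node& h : heads) {
            h.next = h.prev = &h;
        }
    }

    // Insert right after the head, as adjslot_insert does.
    void insert(Proc* p) {
        assert(p->oomadj >= kAdjMin && p->oomadj <= kAdjMax);
        Node* head = &heads[adj_to_slot(p->oomadj)];
        Node* next = head->next;
        p->asl.prev = head;
        p->asl.next = next;
        next->prev = &p->asl;
        head->next = &p->asl;
    }

    // head->prev is the entry that has been in the slot the longest.
    Proc* tail(int adj) {
        Node* head = &heads[adj_to_slot(adj)];
        return head->prev == head ? nullptr : reinterpret_cast<Proc*>(head->prev);
    }
};
```

Because new entries go in at the head, the tail of a slot is always the process that was registered at that oom_adj first.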
2) AMS code
// frameworks/base/services/core/java/com/android/server/am/ProcessList.java
/**
 * Set the out-of-memory badness adjustment for a process.
 * If {@code pid <= 0}, this method will be a no-op.
 *
 * @param pid The process identifier to set.
 * @param uid The uid of the app
 * @param amt Adjustment value -- lmkd allows -1000 to +1000
 *
 * {@hide}
 */
public static void setOomAdj(int pid, int uid, int amt) {
    // This indicates that the process is not started yet and so no need to proceed further.
    if (pid <= 0) {
        return;
    }
    if (amt == UNKNOWN_ADJ)
        return;
    long start = SystemClock.elapsedRealtime();
    ByteBuffer buf = ByteBuffer.allocate(4 * 4);
    buf.putInt(LMK_PROCPRIO);
    buf.putInt(pid);
    buf.putInt(uid);
    buf.putInt(amt);
    // Write the packet to the lmkd socket
    writeLmkd(buf, null);
    long now = SystemClock.elapsedRealtime();
    if ((now-start) > 250) {
        Slog.w("ActivityManager", "SLOW OOM ADJ: " + (now-start) + "ms for pid " + pid
                + " = " + amt);
    }
}
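The packet that setOomAdj builds is four 32-bit integers: the command followed by pid, uid and oom_adj. A Java ByteBuffer writes integers in big-endian order by default, so the wire layout can be sketched as below. The command value 1 for LMK_PROCPRIO and the helper names are assumptions for illustration; check them against the lmkd headers before relying on them.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr int32_t kLmkProcPrio = 1;  // assumed numeric value of LMK_PROCPRIO

using Packet = std::array<uint8_t, 16>;  // 4 fields x 4 bytes

// Store one 32-bit value in big-endian order at byte offset `off`.
void put_be32(Packet& p, std::size_t off, int32_t v) {
    const uint32_t u = static_cast<uint32_t>(v);
    p[off] = static_cast<uint8_t>(u >> 24);
    p[off + 1] = static_cast<uint8_t>(u >> 16);
    p[off + 2] = static_cast<uint8_t>(u >> 8);
    p[off + 3] = static_cast<uint8_t>(u);
}

// Read one big-endian 32-bit value from byte offset `off`.
int32_t get_be32(const Packet& p, std::size_t off) {
    const uint32_t u = (static_cast<uint32_t>(p[off]) << 24) |
                       (static_cast<uint32_t>(p[off + 1]) << 16) |
                       (static_cast<uint32_t>(p[off + 2]) << 8) |
                       static_cast<uint32_t>(p[off + 3]);
    return static_cast<int32_t>(u);
}

// Lay out the fields in the same order as setOomAdj: cmd, pid, uid, oom_adj.
Packet pack_procprio(int32_t pid, int32_t uid, int32_t oom_adj) {
    assert(pid > 0);  // setOomAdj is a no-op for pid <= 0
    Packet p{};
    put_be32(p, 0, kLmkProcPrio);
    put_be32(p, 4, pid);
    put_be32(p, 8, uid);
    put_be32(p, 12, oom_adj);
    return p;
}
```

Negative oom_adj values survive the round trip because the value is reinterpreted as an unsigned 32-bit pattern before it is split into bytes.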
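2.4 Listening for and handling PSI events
During initialization lmkd already subscribed to PSI events and registered the callback mp_event_psi. When a PSI event is received, mp_event_psi is called.
It mainly does the following:
1) Computes and updates the refault and thrashing metrics of the file-backed page cache (a refault is a page that is faulted back in after having been evicted)
2) Uses these metrics together with other system state to decide whether a process must be killed, and records the condition and reason for the kill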
static void mp_event_psi(int data, uint32_t events, struct polling_params *poll_params) {
    ...
    if (level == VMPRESS_LEVEL_LOW) {
        if (enable_preferred_apps &&
            (get_time_diff_ms(&last_pa_update_tm, &curr_tm) >= pa_update_timeout_ms)) {
            if (!use_perf_api_for_pref_apps) {
                if (perf_ux_engine_trigger) {
                    perf_ux_engine_trigger(PAPP_OPCODE, preferred_apps);
                }
            } else {
                if (perf_sync_request) {
                    const char * tmp = perf_sync_request(PAPP_PERF_TRIGGER);
                    if (tmp != NULL) {
                        strlcpy(preferred_apps, tmp, strlen(tmp));
                        free((void *)tmp);
                    }
                }
            }
            last_pa_update_tm = curr_tm;
        }
    }

    record_wakeup_time(&curr_tm, events ? Event : Polling, &wi);

    bool kill_pending = is_kill_pending();
    if (kill_pending && (kill_timeout_ms == 0 ||
        get_time_diff_ms(&last_kill_tm, &curr_tm) < static_cast<long>(kill_timeout_ms))) {
        /* Skip while still killing a process */
        wi.skipped_wakeups++;
        ULMK_LOG(D, "Ignoring %s pressure event; kill already in progress",
                 level_name[level]);
        goto no_kill;
    }
    /*
     * Process is dead or kill timeout is over, stop waiting. This has no effect if pidfds are
     * supported and death notification already caused waiting to stop.
     */
    stop_wait_for_proc_kill(!kill_pending);

    if (vmstat_parse(&vs) < 0) {
        ALOGE("Failed to parse vmstat!");
        return;
    }
    /* Starting 5.9 kernel workingset_refault vmstat field was renamed workingset_refault_file */
    workingset_refault_file = vs.field.workingset_refault ? : vs.field.workingset_refault_file;

    if (meminfo_parse(&mi) < 0) {
        ALOGE("Failed to parse meminfo!");
        return;
    }

    /* Reset states after process got killed */
    if (killing) {
        killing = false;
        cycle_after_kill = true;
        /* Reset file-backed pagecache size and refault amounts after a kill */
        base_file_lru = vs.field.nr_inactive_file + vs.field.nr_active_file;
        init_ws_refault = workingset_refault_file;
        thrashing_reset_tm = curr_tm;
        prev_thrash_growth = 0;
    }

    if (vs.field.compact_stall > init_compact_stall) {
        init_compact_stall = vs.field.compact_stall;
        in_compaction = true;
    }

    /* Identify reclaim state */
    if (vs.field.pgscan_direct != init_pgscan_direct) {
        init_pgscan_direct = vs.field.pgscan_direct;
        init_pgscan_kswapd = vs.field.pgscan_kswapd;
        init_pgrefill = vs.field.pgrefill;
        for (i = VS_PGSKIP_FIRST_ZONE; i <= VS_PGSKIP_LAST_ZONE; i++) {
            init_pgskip[PGSKIP_IDX(i)] = vs.arr[i];
        }
        reclaim = DIRECT_RECLAIM;
    } else if (vs.field.pgscan_direct_throttle > init_direct_throttle) {
        init_direct_throttle = vs.field.pgscan_direct_throttle;
        reclaim = DIRECT_RECLAIM_THROTTLE;
    } else if (vs.field.pgscan_kswapd > init_pgscan_kswapd) {
        init_pgscan_kswapd = vs.field.pgscan_kswapd;
        init_pgrefill = vs.field.pgrefill;
        for (i = VS_PGSKIP_FIRST_ZONE; i <= VS_PGSKIP_LAST_ZONE; i++) {
            init_pgskip[PGSKIP_IDX(i)] = vs.arr[i];
        }
        reclaim = KSWAPD_RECLAIM;
    } else if (vs.field.pgrefill != init_pgrefill) {
        init_pgrefill = vs.field.pgrefill;
        for (i = VS_PGSKIP_FIRST_ZONE; i <= VS_PGSKIP_LAST_ZONE; i++) {
            init_pgskip[PGSKIP_IDX(i)] = vs.arr[i];
        }
        /*
         * On a system with only 2 zones, pgrefill indicating that pages are not eligible.
         * Then there may be real refilling happens for normal zone pages too.
         *
         * This makes to consider only normal zone stats when system is under reclaim, under
         * calc_zone_watermarks.
         */
        if (MGLRU_status) {
            pgskip_deltas[PGSKIP_IDX(VS_PGSKIP_MOVABLE)] = 1;
        }
        reclaim = PGREFILL;
    } else if (workingset_refault_file == prev_workingset_refault) {
        if (enable_preferred_apps &&
            (get_time_diff_ms(&last_pa_update_tm, &curr_tm) >= pa_update_timeout_ms)) {
            if (!use_perf_api_for_pref_apps) {
                if (perf_ux_engine_trigger) {
                    perf_ux_engine_trigger(PAPP_OPCODE, preferred_apps);
                }
            } else {
                if (perf_sync_request) {
                    const char * tmp = perf_sync_request(PAPP_PERF_TRIGGER);
                    if (tmp != NULL) {
                        strlcpy(preferred_apps, tmp, strlen(tmp));
                        free((void *)tmp);
                    }
                }
            }
            last_pa_update_tm = curr_tm;
        }
    }
    prev_workingset_refault = workingset_refault_file;

    /*
     * It's possible we fail to find an eligible process to kill (ex. no process is
     * above oom_adj_min). When this happens, we should retry to find a new process
     * for a kill whenever a new eligible process is available. This is especially
     * important for a slow growing refault case. While retrying, we should keep
     * monitoring new thrashing counter as someone could release the memory to mitigate
     * the thrashing. Thus, when thrashing reset window comes, we decay the prev thrashing
     * counter by window counts. If the counter is still greater than thrashing limit,
     * we preserve the current prev_thrash counter so we will retry kill again. Otherwise,
     * we reset the prev_thrash counter so we will stop retrying.
     */
    since_thrashing_reset_ms = get_time_diff_ms(&thrashing_reset_tm, &curr_tm);
    if (since_thrashing_reset_ms > THRASHING_RESET_INTERVAL_MS) {
        long windows_passed;
        /* Calculate prev_thrash_growth if we crossed THRASHING_RESET_INTERVAL_MS */
        prev_thrash_growth = (workingset_refault_file - init_ws_refault) * 100
                            / (base_file_lru + 1);
        windows_passed = (since_thrashing_reset_ms / THRASHING_RESET_INTERVAL_MS);
        /*
         * Decay prev_thrashing unless over-the-limit thrashing was registered in the window we
         * just crossed, which means there were no eligible processes to kill. We preserve the
         * counter in that case to ensure a kill if a new eligible process appears.
         */
        if (windows_passed > 1 || prev_thrash_growth < thrashing_limit) {
            prev_thrash_growth >>= windows_passed;
        }
        /* Record file-backed pagecache size when crossing THRASHING_RESET_INTERVAL_MS */
        base_file_lru = vs.field.nr_inactive_file + vs.field.nr_active_file;
        init_ws_refault = workingset_refault_file;
        thrashing_reset_tm = curr_tm;
        thrashing_limit = thrashing_limit_pct;
    } else {
        /* Calculate what % of the file-backed pagecache refaulted so far */
        thrashing = (workingset_refault_file - init_ws_refault) * 100 / (base_file_lru + 1);
    }
    /* Add previous cycle's decayed thrashing amount */
    thrashing += prev_thrash_growth;
    if (max_thrashing < thrashing) {
        max_thrashing = thrashing;
    }

    if (zoneinfo_parse(&zi) < 0) {
        ALOGE("Failed to parse zoneinfo!");
        return;
    }
    calc_zone_watermarks(&zi, &zone_mem_info, pgskip_deltas);
    /* Find out which watermark is breached if any */
    wmark = get_lowest_watermark(&mi, &zone_mem_info, level, events, in_compaction);
    LmkdStub::log_meminfo(&mi, wmark);
    if (tune_qcom_lmkd && events == 0) {
        qcom_psi_event = false;
    }

    /*
     * TODO: move this logic into a separate function
     * Decide if killing a process is necessary and record the reason
     */
    if (cycle_after_kill && wmark <= WMARK_LOW) {
        /*
         * Prevent kills not freeing enough memory which might lead to OOM kill.
         * This might happen when a process is consuming memory faster than reclaim can
         * free even after a kill. Mostly happens when running memory stress tests.
         */
        kill_reason = PRESSURE_AFTER_KILL;
        strlcpy(kill_desc, "min watermark is breached even after kill", sizeof(kill_desc));
        min_score_adj = PERCEPTIBLE_RECENT_FOREGROUND_APP_ADJ;
        if (wmark > WMARK_MIN) {
            min_score_adj = VISIBLE_APP_ADJ;
        }
    } else if (reclaim == DIRECT_RECLAIM_THROTTLE) {
        kill_reason = DIRECT_RECL_AND_THROT;
        strlcpy(kill_desc, "system processes are being throttled", sizeof(kill_desc));
    } else if (level == VMPRESS_LEVEL_CRITICAL && wmark <= WMARK_HIGH && qcom_psi_event) {
        /*
         * Device is too busy reclaiming memory which might lead to ANR.
         * Critical level is triggered when PSI complete stall (all tasks are blocked because
         * of the memory congestion) breaches the configured threshold.
         */
        kill_reason = CRITICAL_KILL;
        strlcpy(kill_desc, "critical pressure and device is low on memory", sizeof(kill_desc));
        min_score_adj = PERCEPTIBLE_LOW_APP_ADJ + 1;
    } else if (level == VMPRESS_LEVEL_SUPER_CRITICAL && wmark <= WMARK_HIGH && qcom_psi_event) {
        kill_reason = NOT_RESPONDING;
        strlcpy(kill_desc, "device is not responding", sizeof(kill_desc));
    } else if (swap_is_low && thrashing > thrashing_limit_pct) {
        /* Page cache is thrashing while swap is low */
        kill_reason = LOW_SWAP_AND_THRASHING;
        snprintf(kill_desc, sizeof(kill_desc), "device is low on swap (%" PRId64
            "kB < %" PRId64 "kB) and thrashing (%" PRId64 "%%)",
            get_free_swap(&mi) * page_k, swap_low_threshold * page_k, thrashing);
        /* Do not kill perceptible apps unless below min watermark or heavily thrashing */
        if (wmark > WMARK_MIN && thrashing < thrashing_critical_pct) {
            min_score_adj = PERCEPTIBLE_APP_ADJ + 1;
        }
        check_filecache = true;
    } else if (swap_is_low && wmark <= WMARK_HIGH) {
        /* Both free memory and swap are low */
        kill_reason = LOW_MEM_AND_SWAP;
        snprintf(kill_desc, sizeof(kill_desc), "%s watermark is breached and swap is low (%"
            PRId64 "kB < %" PRId64 "kB)", wmark < WMARK_LOW ? "min" : "low",
            get_free_swap(&mi) * page_k, swap_low_threshold * page_k);
        /* Do not kill perceptible apps unless below min watermark or heavily thrashing */
        if (wmark > WMARK_MIN && thrashing < thrashing_critical_pct) {
            min_score_adj = PERCEPTIBLE_APP_ADJ + 1;
        }
    } else if (wmark < WMARK_HIGH && swap_util_max < 100 &&
               (swap_util = calc_swap_utilization(&mi)) > swap_util_max) {
        /*
         * Too much anon memory is swapped out but swap is not low.
         * Non-swappable allocations created memory pressure.
         */
        kill_reason = LOW_MEM_AND_SWAP_UTIL;
        snprintf(kill_desc, sizeof(kill_desc), "%s watermark is breached and swap utilization"
            " is high (%d%% > %d%%)", wmark < WMARK_LOW ? "min" : "low",
            swap_util, swap_util_max);
    } else if (wmark <= WMARK_HIGH && thrashing > thrashing_limit) {
        /* Page cache is thrashing while memory is low */
        kill_reason = LOW_MEM_AND_THRASHING;
        snprintf(kill_desc, sizeof(kill_desc), "%s watermark is breached and thrashing (%"
            PRId64 "%%)", wmark < WMARK_LOW ? "min" : "low", thrashing);
        cut_thrashing_limit = true;
        min_score_adj = VISIBLE_APP_ADJ;
        check_filecache = true;
    } else if (reclaim == DIRECT_RECLAIM && thrashing > thrashing_limit) {
        /* Page cache is thrashing while in direct reclaim (mostly happens on lowram devices) */
        kill_reason = DIRECT_RECL_AND_THRASHING;
        snprintf(kill_desc, sizeof(kill_desc), "device is in direct reclaim and thrashing (%"
            PRId64 "%%)", thrashing);
        cut_thrashing_limit = true;
        /* Do not kill perceptible apps unless thrashing at critical levels */
        min_score_adj = PERCEPTIBLE_APP_ADJ + 1;
        check_filecache = true;
    } else if (check_filecache) {
        int64_t file_lru_kb = (vs.field.nr_inactive_file + vs.field.nr_active_file) * page_k;
        if (file_lru_kb < filecache_min_kb) {
            /* File cache is too low after thrashing, keep killing background processes */
            kill_reason = LOW_FILECACHE_AFTER_THRASHING;
            snprintf(kill_desc, sizeof(kill_desc),
                "filecache is low (%" PRId64 "kB < %" PRId64 "kB) after thrashing",
                file_lru_kb, filecache_min_kb);
            min_score_adj = PERCEPTIBLE_APP_ADJ + 1;
        } else {
            /* File cache is big enough, stop checking */
            check_filecache = false;
        }
    } else if (reclaim == DIRECT_RECLAIM && wmark <= WMARK_HIGH) {
        kill_reason = DIRECT_RECL_AND_LOW_MEM;
        strlcpy(kill_desc, "device is in direct reclaim and low on memory", sizeof(kill_desc));
        min_score_adj = PERCEPTIBLE_APP_ADJ;
    } else if (in_compaction && wmark <= WMARK_HIGH) {
        kill_reason = COMPACTION;
        strlcpy(kill_desc, "device is in compaction and low on memory", sizeof(kill_desc));
        min_score_adj = VISIBLE_APP_ADJ;
    }

    /* Kill a process if necessary */
    if (kill_reason != NONE) {
        struct kill_info ki = {
            .kill_reason = kill_reason,
            .kill_desc = kill_desc,
            .thrashing = (int)thrashing,
            .max_thrashing = max_thrashing,
        };
        /* Allow killing perceptible apps if the system is stalled */
        if (critical_stall) {
            min_score_adj = 0;
        }
        psi_parse_io(&psi_data);
        psi_parse_cpu(&psi_data);
        // Find a suitable process and kill it
        int pages_freed = find_and_kill_process(min_score_adj, &ki, &mi, &wi, &curr_tm, &psi_data);
        if (pages_freed > 0) {
            killing = true;
            max_thrashing = 0;
            /* Killed..Just reduce/increase the boost... */
            if (kill_reason == CRITICAL_KILL || kill_reason == DIRECT_RECL_AND_THROT) {
                wbf_effective = std::min(wbf_effective + wbf_step, wmark_boost_factor);
            } else {
                wbf_effective = std::max(wbf_effective - wbf_step, 1);
            }
            if (cut_thrashing_limit) {
                /*
                 * Cut thrashing limit by thrashing_limit_decay_pct percentage of the current
                 * thrashing limit until the system stops thrashing.
                 */
                thrashing_limit = (thrashing_limit * (100 - thrashing_limit_decay_pct)) / 100;
            }
        }
    } else {
        ...
    }
    ...
}
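The thrashing arithmetic in mp_event_psi can be isolated into two small functions. This is a sketch of the same formulas with hypothetical function names; the clamp on very large window counts is an addition to keep the shift well defined and is not part of the code above.

```cpp
#include <cassert>
#include <cstdint>

// Percentage of the file-backed page cache that refaulted since the baseline.
// The +1 in the divisor avoids division by zero when the cache is empty.
int64_t thrashing_pct(int64_t refault_now, int64_t refault_base, int64_t base_file_lru) {
    assert(refault_now >= refault_base && base_file_lru >= 0);
    return (refault_now - refault_base) * 100 / (base_file_lru + 1);
}

// Halve the carried-over growth once per elapsed reset window. The value is
// preserved only when exactly one window passed and it is still at or above
// the limit, which is the "no eligible process was found" case.
int64_t decay_prev_growth(int64_t prev_growth, long windows_passed, int64_t limit) {
    assert(prev_growth >= 0 && windows_passed >= 0);
    if (windows_passed > 1 || prev_growth < limit) {
        if (windows_passed >= 63) {
            return 0;  // added guard: shifting by 64 or more bits is undefined
        }
        prev_growth >>= windows_passed;
    }
    return prev_growth;
}
```

With 500 refaulted pages against a baseline cache of 999 pages, thrashing_pct yields 50, meaning half of the cache had to be read back in.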
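Kill reasons, in the order in which mp_event_psi checks them:
Kill reason | Condition |
PRESSURE_AFTER_KILL | A kill was just performed and the watermark is still at or below the low watermark |
DIRECT_RECL_AND_THROT | Direct reclaim is being throttled |
CRITICAL_KILL | Pressure level is CRITICAL, the watermark is at or below the high watermark, and the wake-up was triggered by a PSI event |
NOT_RESPONDING | Pressure level is SUPER_CRITICAL, the watermark is at or below the high watermark, and the wake-up was triggered by a PSI event |
LOW_SWAP_AND_THRASHING | Free swap is low and thrashing exceeds thrashing_limit_pct |
LOW_MEM_AND_SWAP | Free swap is low and the watermark is at or below the high watermark |
LOW_MEM_AND_SWAP_UTIL | The watermark is below the high watermark, swap_util_max is less than 100, and swap utilization exceeds swap_util_max |
LOW_MEM_AND_THRASHING | The watermark is at or below the high watermark and thrashing exceeds the current thrashing limit |
DIRECT_RECL_AND_THRASHING | The device is in direct reclaim and thrashing exceeds the current thrashing limit |
LOW_FILECACHE_AFTER_THRASHING | After a thrashing-related kill, the file cache size file_lru_kb is below the configured minimum filecache_min_kb |
DIRECT_RECL_AND_LOW_MEM | The device is in direct reclaim and the watermark is at or below the high watermark |
COMPACTION | Memory compaction is in progress and the watermark is at or below the high watermark |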
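Because the reasons are evaluated as one if/else chain, the first matching condition wins. The sketch below shows that priority order for a subset of the reasons only. It is deliberately simplified: it uses a single thrashing limit where lmkd distinguishes thrashing_limit_pct from the decaying thrashing_limit, and the Snapshot struct and pick_reason function are hypothetical names.

```cpp
#include <cassert>

enum Watermark { WMARK_MIN = 0, WMARK_LOW, WMARK_HIGH, WMARK_NONE };

enum class Reason {
    None,
    PressureAfterKill,
    DirectReclaimThrottle,
    LowSwapAndThrashing,
    LowMemAndSwap,
    LowMemAndThrashing,
};

// Hypothetical bundle of the state that the decision chain reads.
struct Snapshot {
    bool cycle_after_kill;
    bool reclaim_throttled;
    bool swap_is_low;
    Watermark wmark;       // lowest breached watermark, WMARK_NONE if none
    long thrashing;        // percent
    long thrashing_limit;  // percent
};

// Return the first reason whose condition holds, mirroring the chain order.
Reason pick_reason(const Snapshot& s) {
    assert(s.thrashing >= 0 && s.thrashing_limit >= 0);
    if (s.cycle_after_kill && s.wmark <= WMARK_LOW) {
        return Reason::PressureAfterKill;
    }
    if (s.reclaim_throttled) {
        return Reason::DirectReclaimThrottle;
    }
    if (s.swap_is_low && s.thrashing > s.thrashing_limit) {
        return Reason::LowSwapAndThrashing;
    }
    if (s.swap_is_low && s.wmark <= WMARK_HIGH) {
        return Reason::LowMemAndSwap;
    }
    if (s.wmark <= WMARK_HIGH && s.thrashing > s.thrashing_limit) {
        return Reason::LowMemAndThrashing;
    }
    return Reason::None;
}
```

A snapshot that satisfies several conditions at once is therefore reported under the earliest one, which is why a state with low swap and heavy thrashing right after a kill is recorded as PRESSURE_AFTER_KILL.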
oom_adj levels defined in lmkd:
ADJ | Value | Description |
PRE_PREVIOUS_APP_ADJ | 701 | The previous app |
PERCEPTIBLE_LOW_APP_ADJ | 250 | Perceptible apps with lower priority |
PERCEPTIBLE_APP_ADJ | 200 | Perceptible apps |
VISIBLE_APP_ADJ | 100 | Apps visible to the user |
PERCEPTIBLE_RECENT_FOREGROUND_APP_ADJ | 50 | Perceptible apps that were recently in the foreground |
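2.5 Process selection and killing
Main flow:
1) Compute a number of parameters and states
2) Determine the kill reason from the resulting state
3) Select a process and kill it
2.5.1 Process selection
find_and_kill_process finds and kills one process at or above the given oom_score_adj level in order to free memory. It walks the oom_score_adj levels from the highest down to min_score_adj and picks a suitable process: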
/*
 * Find one process to kill at or above the given oom_score_adj level.
 * Returns size of the killed process.
 */
static int find_and_kill_process(int min_score_adj, struct kill_info *ki, union meminfo *mi,
                                 struct wakeup_info *wi, struct timespec *tm,
                                 struct psi_data *pd) {
    int i;
    int killed_size = 0;
    bool lmk_state_change_start = false;
    bool choose_heaviest_task = kill_heaviest_task;

    for (i = OOM_SCORE_ADJ_MAX; i >= min_score_adj; i--) {
        struct proc *procp;

        if (!choose_heaviest_task && i <= PERCEPTIBLE_APP_ADJ) { // perceptible apps
            /*
             * If we have to choose a perceptible process, choose the heaviest one to
             * hopefully minimize the number of victims.
             */
            choose_heaviest_task = true;
        }

        // Perceptible apps go through proc_get_heaviest; everything else goes through proc_adj_tail
        while (true) {
            procp = choose_heaviest_task ?
                proc_get_heaviest(i) : proc_adj_tail(i);

            if (!procp)
                break;

            // The kill itself is described in 2.5.2
            killed_size = kill_one_process(procp, min_score_adj, ki, mi, wi, tm, pd);
            if (killed_size >= 0) {
                if (!lmk_state_change_start) {
                    lmk_state_change_start = true;
                    stats_write_lmk_state_changed(STATE_START);
                }
                break;
            }
        }
        if (killed_size) {
            break;
        }
    }

    if (!killed_size && !min_score_adj && is_userdebug_or_eng_build) {
        killed_size = LmkdStub::proc_get_script();
    }

    if (lmk_state_change_start) {
        stats_write_lmk_state_changed(STATE_STOP);
    }

    return killed_size;
}
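The selection policy of this loop can be sketched without any of the kill machinery. The version below is a simplified model under stated assumptions: slots are a std::map from oom_adj to a vector whose first element is the oldest entry (the list tail in lmkd), process sizes are given up front instead of being read from /proc, and pick_victim only returns a pid. All names are hypothetical.

```cpp
#include <cassert>
#include <map>
#include <vector>

struct Candidate {
    int pid;
    long rss_pages;
};

constexpr int kOomScoreAdjMax = 1000;
constexpr int kPerceptibleAppAdj = 200;

// Slots keyed by oom_adj; within a slot, index 0 is the oldest entry.
using Slots = std::map<int, std::vector<Candidate>>;

// Walk from the highest oom_adj down to min_score_adj and return the pid of
// the first victim found, or -1 if no slot in range holds a process.
int pick_victim(const Slots& slots, int min_score_adj, bool kill_heaviest_task) {
    assert(min_score_adj <= kOomScoreAdjMax);
    bool heaviest = kill_heaviest_task;
    for (int adj = kOomScoreAdjMax; adj >= min_score_adj; --adj) {
        if (adj <= kPerceptibleAppAdj) {
            heaviest = true;  // perceptible levels always use the heaviest process
        }
        auto it = slots.find(adj);
        if (it == slots.end() || it->second.empty()) {
            continue;
        }
        const std::vector<Candidate>& procs = it->second;
        if (!heaviest) {
            return procs.front().pid;  // tail of the slot: oldest entry
        }
        const Candidate* best = &procs.front();
        for (const Candidate& c : procs) {
            if (c.rss_pages > best->rss_pages) {
                best = &c;
            }
        }
        return best->pid;
    }
    return -1;
}
```

The switch to the heaviest process at perceptible levels trades fairness for fewer victims: one large kill is more likely to relieve pressure than several small ones.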
For perceptible apps, that is levels at or below PERCEPTIBLE_APP_ADJ (PERCEPTIBLE_APP_ADJ, VISIBLE_APP_ADJ, PERCEPTIBLE_RECENT_FOREGROUND_APP_ADJ), or for every level when kill_heaviest_task is set, the process is chosen by proc_get_heaviest:
// Pick the process with the largest memory footprint at the given oomadj level
static struct proc *proc_get_heaviest(int oomadj) {
    struct adjslot_list *head = &procadjslot_list[ADJTOSLOT(oomadj)];
    struct adjslot_list *curr = head->next;
    struct proc *maxprocp = NULL;
    int maxsize = 0;
    /* Filter out PApps */
    struct proc *maxprocp_pa = NULL;
    int maxsize_pa = 0;
    char *tmp_taskname;
    char buf[LINE_MAX];

    // Walk the list and compare the memory used by each process to find the largest one
    while (curr != head) {
        int pid = ((struct proc *)curr)->pid;
        // Read the process size (RSS from the "/proc/%d/statm" node)
        long tasksize = LmkdStub::proc_get_size(pid);
        if (tasksize < 0) {
            struct adjslot_list *next = curr->next;
            pid_remove(pid);
            curr = next;
        } else {
            tmp_taskname = proc_get_name(pid, buf, sizeof(buf));
            if (tmp_taskname != NULL &&
                CamOptLmkdPolicy::getInstance()->is_protected_task(oomadj, tmp_taskname, pid)) {
                // protect last recent app for camera foreground.
                if (debug_process_killing) {
                    ALOGI("%s, skip %s in oomadj %d", __func__, tmp_taskname, oomadj);
                }
            } else if (enable_preferred_apps && tmp_taskname != NULL && strstr(preferred_apps, tmp_taskname)) {
                if (tasksize > maxsize_pa) {
                    maxsize_pa = tasksize;
                    maxprocp_pa = (struct proc *)curr;
                }
            } else {
                if (tasksize > maxsize) {
                    maxsize = tasksize;
                    maxprocp = (struct proc *)curr;
                }
            }
            curr = curr->next;
        }
    }
    if (maxsize > 0) {
        return maxprocp;
    } else if (maxsize_pa > 0) {
        return maxprocp_pa;
    }
    return (struct proc *)CamOptLmkdPolicy::getInstance()->proc_get_heaviest_extend(oomadj, procadjslot_list);
}

// Read the RSS from the "/proc/%d/statm" node
long proc_get_rss(int pid) {
    static char path[PATH_MAX];
    static char line[LINE_MAX];
    int fd;
    long rss = 0;
    long total;
    ssize_t ret;

    /* gid containing AID_READPROC required */
    snprintf(path, PATH_MAX, "/proc/%d/statm", pid);
    fd = open(path, O_RDONLY | O_CLOEXEC);
    if (fd == -1) {
        return -1;
    }
    ret = read_all(fd, line, sizeof(line) - 1);
    if (ret < 0) {
        close(fd);
        return -1;
    }
    sscanf(line, "%ld %ld ", &total, &rss);
    close(fd);
    return rss;
}
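The statm parsing step can be tried on its own. The first two fields of /proc/<pid>/statm are the total program size and the resident set size, both in pages. The sketch below parses a line that has already been read into a string; unlike proc_get_rss it checks the sscanf result, which is an addition. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Return the resident set size in pages from a statm line, or -1 if the
// line does not start with two integers.
long parse_statm_rss(const std::string& line) {
    long total = 0;
    long rss = 0;
    if (std::sscanf(line.c_str(), "%ld %ld", &total, &rss) != 2) {
        return -1;
    }
    return rss;
}

// Convert a page count to kilobytes for a given page size in bytes.
long pages_to_kb(long pages, long page_size_bytes) {
    assert(page_size_bytes > 0 && page_size_bytes % 1024 == 0);
    return pages * (page_size_bytes / 1024);
}
```

On a device the page size would normally come from sysconf(_SC_PAGESIZE); it is passed in here so the arithmetic stays explicit.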
For imperceptible or lower-priority apps (levels above PERCEPTIBLE_APP_ADJ, such as PRE_PREVIOUS_APP_ADJ and PERCEPTIBLE_LOW_APP_ADJ), the process is chosen by proc_adj_tail:
// Take the target process from the tail of the procadjslot_list slot
static struct proc *proc_adj_tail(int oomadj) {
    return (struct proc *)adjslot_tail(&procadjslot_list[ADJTOSLOT(oomadj)]);
}

static struct adjslot_list *adjslot_tail(struct adjslot_list *head) {
    struct adjslot_list *asl = head->prev;
    return asl == head ? NULL : asl;
}

struct adjslot_list procadjslot_list[ADJTOSLOT_COUNT];
2.5.2 Process killing
The main thread first pushes the target's pid and uid onto a queue; an asynchronous thread then takes the pid and uid from the queue and kills the process.
The main thread enqueues the pid and uid as follows:
static int kill_one_process(struct proc* procp, int min_oom_score, struct kill_info *ki,
                            union meminfo *mi, struct wakeup_info *wi, struct timespec *tm,
                            struct psi_data *pd) {
    ...
    start_wait_for_proc_kill(pidfd < 0 ? pid : pidfd);
    // Kill the process asynchronously
    kill_result = reaper.kill({ pidfd, pid, uid }, false);
    ...
}

int Reaper::kill(const struct target_proc& target, bool synchronous) {
    /* CAP_KILL required */
    if (target.pidfd < 0) {
        return ::kill(target.pid, SIGKILL);
    }
    // Asynchronous kill
    if (!synchronous && async_kill(target)) {
        // we assume the kill will be successful and if it fails we will be notified
        return 0;
    }
    int result = pidfd_send_signal(target.pidfd, SIGKILL, NULL, 0);
    if (result) {
        return result;
    }
    return 0;
}

// Put the target's pidfd, pid and uid on queue_ and wake up a reaper thread
bool Reaper::async_kill(const struct target_proc& target) {
    ...
    mutex_.lock();
    active_requests_++;
    queue_.push_back({ dup(target.pidfd), target.pid, target.uid });
    // Wake up a reaper thread
    cond_.notify_one();
    mutex_.unlock();
    return true;
}
During init, lmkd creates worker threads that watch the queue. While the queue is empty a worker waits; when the main thread pushes a target onto the queue, the worker is woken up, takes the pid and uid from the queue and kills the process:
bool Reaper::init(int comm_fd) {
    ...
    pthread_create(&thread_pool_[thread_cnt_], NULL, reaper_main, this);
    ...
    queue_.reserve(thread_cnt_);
    ...
}

static void* reaper_main(void* param) {
    ...
    for (;;) {
        // Take a target from the queue
        target = reaper->dequeue_request();
        // Send SIGKILL to the target process
        pidfd_send_signal(target.pidfd, SIGKILL, NULL, 0);
        ...
    }
    ...
}

Reaper::target_proc Reaper::dequeue_request() {
    struct target_proc target;
    std::unique_lock<std::mutex> lock(mutex_);
    // Sleep while the queue is empty; otherwise take the uid and pid from the queue
    while (queue_.empty()) {
        cond_.wait(lock);
    }
    target = queue_.back();
    queue_.pop_back();
    return target;
}
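The hand-off between the main thread and the reaper threads is a standard mutex plus condition variable queue. The sketch below reproduces only that pattern with standard C++ types; RequestQueue and Target are hypothetical names, and no signal is sent. Like the code above, it takes requests from the back of the vector, so the most recently queued target is served first.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <vector>

struct Target {
    int pidfd;
    int pid;
    int uid;
};

class RequestQueue {
  public:
    // Called by the producer (the main thread in lmkd).
    void enqueue(const Target& t) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            queue_.push_back(t);
        }
        cond_.notify_one();  // wake one waiting worker
    }

    // Called by a worker; blocks while the queue is empty.
    Target dequeue() {
        std::unique_lock<std::mutex> lock(mutex_);
        cond_.wait(lock, [this] { return !queue_.empty(); });
        assert(!queue_.empty());
        Target t = queue_.back();
        queue_.pop_back();
        return t;
    }

    std::size_t pending() {
        std::lock_guard<std::mutex> lock(mutex_);
        return queue_.size();
    }

  private:
    std::mutex mutex_;
    std::condition_variable cond_;
    std::vector<Target> queue_;
};
```

The predicate form of wait() is equivalent to the while loop in dequeue_request and protects against spurious wake-ups in the same way.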
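3. Key system properties
The properties use the prefixes persist.device_config.lmkd_native.* and ro.lmk.*.
Property | Purpose |
low, medium, critical | Set the oom_score_adj threshold for each memory pressure level, that is, the minimum oom_score_adj a process must have to be eligible for killing at that level. low corresponds to mild memory pressure, critical to high memory pressure. |
debug | Enables or disables lmkd debug output. |
critical_upgrade | Enables or disables the pressure upgrade logic, which raises the memory pressure level under certain conditions. |
upgrade_pressure, downgrade_pressure | Set the thresholds at which the memory pressure level is upgraded or downgraded. |
kill_heaviest_task | Decides whether the task with the largest memory footprint is killed first. |
kill_timeout_ms | Sets the kill timeout: while a kill is pending and this time has not elapsed since the last kill, new pressure events are ignored. |
use_minfree_levels | Decides whether the minfree level configuration is used. |
swap_free_low_percentage | Sets the percentage of free swap below which the system is considered low on swap. |
psi_partial_stall_ms, psi_complete_stall_ms | Set the PSI (Pressure Stall Information) partial and complete stall thresholds used to monitor memory pressure. |
thrashing_limit_pct, thrashing_limit_decay_pct | Set the thrashing limit and its decay percentage, which control how lmkd reacts to page cache thrashing. |
swap_util_max | Sets the maximum swap utilization; when utilization exceeds it while a watermark is breached, lmkd kills (LOW_MEM_AND_SWAP_UTIL). |
filecache_min_kb | Sets the minimum file cache size; after thrashing, lmkd keeps killing background processes while the file cache is below it. |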
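The two prefixes suggest a lookup order. The sketch below assumes that a persist.device_config.lmkd_native.* value, when present, overrides the ro.lmk.* value, which in turn overrides the built-in default; this precedence is an assumption about GET_LMK_PROPERTY and should be checked against the lmkd source. A std::map stands in for the Android property store so the example runs anywhere, and all names are hypothetical.

```cpp
#include <cassert>
#include <exception>
#include <map>
#include <string>

// Stand-in for the Android system property store.
using PropertyStore = std::map<std::string, std::string>;

// Read an integer property, falling back to `def` if it is missing or malformed.
int get_int_property(const PropertyStore& props, const std::string& key, int def) {
    auto it = props.find(key);
    if (it == props.end()) {
        return def;
    }
    try {
        return std::stoi(it->second);
    } catch (const std::exception&) {
        return def;
    }
}

// Assumed precedence: device_config override, then ro.lmk, then the default.
int get_lmk_property(const PropertyStore& props, const std::string& name, int def) {
    assert(!name.empty());
    const int ro_value = get_int_property(props, "ro.lmk." + name, def);
    return get_int_property(props, "persist.device_config.lmkd_native." + name, ro_value);
}
```

Under this model a build-time ro.lmk.swap_util_max of 90 is used until a device_config value for the same name appears.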
4. Core data structures
// epoll event structure
struct epoll_event {
    uint32_t events;
    epoll_data_t data;
};

// Watermarks used for memory reclaim
enum zone_watermark {
    WMARK_MIN = 0,
    WMARK_LOW,
    WMARK_HIGH,
    WMARK_NONE
};

// Holds the information about one process
struct proc {
    struct adjslot_list asl;
    int pid;
    int pidfd;
    uid_t uid;
    int oomadj;
    pid_t reg_pid; /* PID of the process that registered this record */
    bool valid;
    struct proc *pidhash_next;
};

// List node embedded as the first member of struct proc; links the procs of one oom_adj slot
struct adjslot_list {
    struct adjslot_list *next;
    struct adjslot_list *prev;
};

// Socket information used to set up epoll monitoring of the lmkd socket
struct sock_event_handler_info {
    int sock;
    pid_t pid;
    uint32_t async_event_mask;
    struct event_handler_info handler_info;
};
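5. Code sequence
The code sequence has three parts:
1) The AMS client writes the process information (pid, oom_adj) to the lmkd socket
2) lmkd initializes: it sets up epoll monitoring and the callback for the lmkd socket, and sets up epoll monitoring and the callbacks for PSI
3) lmkd enters its main loop and waits on epoll for PSI events; when an event occurs, the registered callback runs, selects a target process and kills it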
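The dispatch step in part 3 relies on the handler pointer stored in epev.data.ptr. The sketch below shows that pattern on Linux with an eventfd standing in for the PSI file descriptor, so it uses EPOLLIN rather than the EPOLLPRI that PSI triggers raise. The handler signature is simplified compared with lmkd's, and HandlerInfo, dispatch_once and demo_round_trip are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

// Simplified handler record: lmkd's handlers also receive polling parameters.
struct HandlerInfo {
    int data;
    void (*handler)(int data, uint32_t events);
};

static int g_last_level = -1;

// Stand-in for mp_event_psi: just remember which level fired.
static void on_pressure(int data, uint32_t /*events*/) {
    g_last_level = data;
}

// Wait once and call the handler attached to every ready descriptor.
int dispatch_once(int epollfd, int timeout_ms) {
    assert(timeout_ms >= -1);
    struct epoll_event events[8];
    int n = epoll_wait(epollfd, events, 8, timeout_ms);
    for (int i = 0; i < n; ++i) {
        auto* info = static_cast<HandlerInfo*>(events[i].data.ptr);
        if (info != nullptr && info->handler != nullptr) {
            info->handler(info->data, events[i].events);
        }
    }
    return n;
}

// Register an eventfd, signal it, and return the level seen by the handler
// (or -1 on any failure).
int demo_round_trip() {
    static HandlerInfo info = {2, on_pressure};
    int result = -1;
    int epollfd = epoll_create1(EPOLL_CLOEXEC);
    int evfd = eventfd(0, EFD_CLOEXEC);
    if (epollfd >= 0 && evfd >= 0) {
        struct epoll_event ev = {};
        ev.events = EPOLLIN;
        ev.data.ptr = &info;
        uint64_t one = 1;
        if (epoll_ctl(epollfd, EPOLL_CTL_ADD, evfd, &ev) == 0 &&
            write(evfd, &one, sizeof(one)) == static_cast<ssize_t>(sizeof(one)) &&
            dispatch_once(epollfd, 100) == 1) {
            result = g_last_level;
        }
    }
    if (evfd >= 0) close(evfd);
    if (epollfd >= 0) close(epollfd);
    return result;
}
```

Storing a pointer to the handler record in the event keeps the loop generic: the same dispatch code serves the control socket, the data sockets and every PSI level.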