
[Axu Machine Learning in Practice] [35] Employee Turnover Prediction: Decision Tree and Random Forest

news 2025/12/21 15:41:28

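The [Axu Machine Learning in Practice] series introduces machine learning algorithms and models through hands-on case studies.

The goal of this article is to use decision tree and random forest models to predict how likely an employee is to leave, and to help the HR department understand why employees leave.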

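1. Getting the data

The dataset, source code, and project documentation for this article are available from the WeChat official account 阿旭算法与机器学习 by replying "ML35".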

# Import the required packages
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib as matplot
import seaborn as sns
%matplotlib inline

# Read the data into the pandas DataFrame "df"
df = pd.read_csv('HR_comma_sep.csv', index_col=None)

2. Data preprocessing

# Check whether any column has missing data
df.isnull().any()

satisfaction_level       False
last_evaluation          False
number_project           False
average_montly_hours     False
time_spend_company       False
Work_accident            False
left                     False
promotion_last_5years    False
sales                    False
salary                   False
dtype: bool
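
`isnull().any()` only reports whether a column contains at least one missing value. A minimal sketch of also counting the missing cells per column, using a small made-up frame (the values and the name `toy` are illustrative, not taken from the HR dataset):

```python
import numpy as np
import pandas as pd

# Toy frame (made-up values, not the HR dataset) with one missing entry
toy = pd.DataFrame({
    "satisfaction_level": [0.38, np.nan, 0.11],
    "salary": ["low", "medium", "medium"],
})

has_missing = toy.isnull().any()     # True/False per column
missing_counts = toy.isnull().sum()  # number of missing cells per column
print(has_missing)
print(missing_counts)
```

When a column does contain gaps, the count helps decide between dropping rows and imputing values.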

# Sample rows of the data
df.head()

	satisfaction_level	last_evaluation	number_project	average_montly_hours	time_spend_company	left	sales	salary
0	0.38	0.53	2	157	3	1	sales	low
1	0.80	0.86	5	262	6	1	sales	medium
2	0.11	0.88	7	272	4	1	sales	medium
3	0.72	0.87	5	223	5	1	sales	low
4	0.37	0.52	2	159	3	1	sales	low

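Note: the "left" column (renamed to "turnover" below) is the label: 1 means the employee left, 0 means the employee stayed. All other columns are features.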

# Rename the columns
df = df.rename(columns={'satisfaction_level': 'satisfaction',
                        'last_evaluation': 'evaluation',
                        'number_project': 'projectCount',
                        'average_montly_hours': 'averageMonthlyHours',
                        'time_spend_company': 'yearsAtCompany',
                        'Work_accident': 'workAccident',
                        'promotion_last_5years': 'promotion',
                        'sales': 'department',
                        'left': 'turnover'})

# Move the prediction label 'turnover' to the first column
front = df['turnover']
df.drop(labels=['turnover'], axis=1, inplace = True)
df.insert(0, 'turnover', front)
df.head()

	turnover	satisfaction	evaluation	projectCount	averageMonthlyHours	yearsAtCompany	department	salary
0	1	0.38	0.53	2	157	3	sales	low
1	1	0.80	0.86	5	262	6	sales	medium
2	1	0.11	0.88	7	272	4	sales	medium
3	1	0.72	0.87	5	223	5	sales	low
4	1	0.37	0.52	2	159	3	sales	low

3. Analyzing the data

14,999 records, each with 10 columns (9 features plus the turnover label)
Overall turnover rate: 24%
Average satisfaction: 0.61

df.shape

(14999, 10)

# Data types of the columns
df.dtypes

turnover                 int64
satisfaction           float64
evaluation             float64
projectCount             int64
averageMonthlyHours      int64
yearsAtCompany           int64
workAccident             int64
promotion                int64
department              object
salary                  object
dtype: object

turnover_rate = df.turnover.value_counts() / len(df)
turnover_rate

0    0.761917
1    0.238083
Name: turnover, dtype: float64
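
The class shares above can also be obtained directly with `value_counts(normalize=True)`. A small sketch on made-up labels (the series `labels` is illustrative only), checking that both approaches agree:

```python
import pandas as pd

# Toy labels (made up): 1 = left, 0 = stayed
labels = pd.Series([0, 0, 0, 1], name="turnover")

rate_manual = labels.value_counts() / len(labels)       # the approach used above
rate_normalized = labels.value_counts(normalize=True)   # equivalent built-in option
print(rate_normalized)
```

Knowing that roughly one employee in four left matters later: accuracy alone can look good on imbalanced labels, which is why precision and recall per class are reported for the models.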

# Show summary statistics
df.describe()

	turnover	satisfaction	evaluation	projectCount	averageMonthlyHours	yearsAtCompany	workAccident	promotion
count	14999.000000	14999.000000	14999.000000	14999.000000	14999.000000	14999.000000	14999.000000	14999.000000
mean	0.238083	0.612834	0.716102	3.803054	201.050337	3.498233	0.144610	0.021268
std	0.425924	0.248631	0.171169	1.232592	49.943099	1.460136	0.351719	0.144281
min	0.000000	0.090000	0.360000	2.000000	96.000000	2.000000	0.000000	0.000000
25%	0.000000	0.440000	0.560000	3.000000	156.000000	3.000000	0.000000	0.000000
50%	0.000000	0.640000	0.720000	4.000000	200.000000	3.000000	0.000000	0.000000
75%	0.000000	0.820000	0.870000	5.000000	245.000000	4.000000	0.000000	0.000000
max	1.000000	1.000000	1.000000	7.000000	310.000000	10.000000	1.000000	1.000000

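# Mean of each numeric column, grouped by turnover
turnover_Summary = df.groupby('turnover')
# numeric_only=True skips the string columns (department, salary); recent pandas versions raise an error without it
turnover_Summary.mean(numeric_only=True)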

	satisfaction	evaluation	projectCount	averageMonthlyHours	yearsAtCompany	workAccident	promotion
turnover
0	0.666810	0.715473	3.786664	199.060203	3.380032	0.175009	0.026251
1	0.440098	0.718113	3.855503	207.419210	3.876505	0.047326	0.005321

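3.1 Correlation analysis

# Correlation matrix
# numeric_only=True restricts the computation to numeric columns; recent pandas versions require it when string columns are present
corr = df.corr(numeric_only=True)
sns.heatmap(corr, xticklabels=corr.columns.values, yticklabels=corr.columns.values)
corr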

	turnover	satisfaction	evaluation	projectCount	averageMonthlyHours	yearsAtCompany	workAccident	promotion
turnover	1.000000	-0.388375	0.006567	0.023787	0.071287	0.144822	-0.154622	-0.061788
satisfaction	-0.388375	1.000000	0.105021	-0.142970	-0.020048	-0.100866	0.058697	0.025605
evaluation	0.006567	0.105021	1.000000	0.349333	0.339742	0.131591	-0.007104	-0.008684
projectCount	0.023787	-0.142970	0.349333	1.000000	0.417211	0.196786	-0.004741	-0.006064
averageMonthlyHours	0.071287	-0.020048	0.339742	0.417211	1.000000	0.127755	-0.010143	-0.003544
yearsAtCompany	0.144822	-0.100866	0.131591	0.196786	0.127755	1.000000	0.002120	0.067433
workAccident	-0.154622	0.058697	-0.007104	-0.004741	-0.010143	0.002120	1.000000	0.039245
promotion	-0.061788	0.025605	-0.008684	-0.006064	-0.003544	0.067433	0.039245	1.000000

[Figure: heatmap of the correlation matrix]

Positively correlated features:

projectCount VS evaluation: 0.349333
projectCount VS averageMonthlyHours: 0.417211
averageMonthlyHours VS evaluation: 0.339742

Negatively correlated features:

satisfaction VS turnover: -0.388375
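
Rather than reading the strongest relationships off the matrix by eye, the label column can be ranked by absolute correlation. The sketch below does this on synthetic data (the columns `satisfaction` and `noise` and the rule generating `turnover` are made up for illustration, not taken from the HR dataset):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
n = 500
# Synthetic data: 'satisfaction' drives the label, 'noise' does not
satisfaction = rng.uniform(0, 1, n)
noise = rng.uniform(0, 1, n)
turnover = (satisfaction < 0.4).astype(int)
toy = pd.DataFrame({"turnover": turnover,
                    "satisfaction": satisfaction,
                    "noise": noise})

# Correlation of every feature with the label, strongest first
corr_with_label = toy.corr()["turnover"].drop("turnover")
order = corr_with_label.abs().sort_values(ascending=False).index
ranked = corr_with_label.reindex(order)
print(ranked)
```

Pearson correlation only captures linear relationships, so a feature with a low coefficient can still be useful to a tree-based model.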

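# Compare the satisfaction of employees who stayed with that of employees who left
emp_population = df['satisfaction'][df['turnover'] == 0].mean()
emp_turnover_satisfaction = df[df['turnover']==1]['satisfaction'].mean()
print('Satisfaction of employees who stayed: ' + str(emp_population))
print('Satisfaction of employees who left: ' + str(emp_turnover_satisfaction))

Satisfaction of employees who stayed: 0.666809590479516
Satisfaction of employees who left: 0.44009801176140917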

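3.2 Running a t-test

We run a one-sample t-test to check whether the satisfaction of employees who left differs significantly from the mean satisfaction of employees who stayed.

import scipy.stats as stats
stats.ttest_1samp(a=df[df['turnover']==1]['satisfaction'],  # satisfaction sample of employees who left
                  popmean=emp_population)  # mean satisfaction of employees who stayed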

Ttest_1sampResult(statistic=-51.3303486754725, pvalue=0.0)

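The t-test reports a p-value so small that it prints as 0, so the satisfaction of employees who left differs significantly from that of employees who stayed.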
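
For reference, the one-sample statistic is t = (sample mean - popmean) / (s / sqrt(n)), where s is the sample standard deviation. A minimal sketch on synthetic numbers (the sample, its parameters, and `popmean` below are made up, not the HR data) reproduces what `ttest_1samp` computes:

```python
import numpy as np
import scipy.stats as stats

rng = np.random.default_rng(42)
# Synthetic "satisfaction" sample (illustrative values only)
sample = rng.normal(loc=0.44, scale=0.26, size=200)
popmean = 0.67  # hypothetical reference mean

n = len(sample)
# ddof=1 gives the sample standard deviation used by the t-test
t_manual = (sample.mean() - popmean) / (sample.std(ddof=1) / np.sqrt(n))
result = stats.ttest_1samp(a=sample, popmean=popmean)
print(t_manual, result.statistic, result.pvalue)
```

A one-sample test treats the reference mean as a fixed number. If the variability of the comparison group should also be taken into account, a two-sample test such as `stats.ttest_ind(..., equal_var=False)` is an alternative.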

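# Note: strictly, a one-sample t-test has n - 1 degrees of freedom; with a sample this large the difference is negligible
degree_freedom = len(df[df['turnover']==1])
LQ = stats.t.ppf(0.025, degree_freedom)  # left boundary of the central 95% region
RQ = stats.t.ppf(0.975, degree_freedom)  # right boundary of the central 95% region
print('t-distribution left boundary: ' + str(LQ))
print('t-distribution right boundary: ' + str(RQ))

t-distribution left boundary: -1.9606285215955626
t-distribution right boundary: 1.9606285215955621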
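
The two boundaries are the critical values of a two-sided test at the 5% level: the null hypothesis is rejected when the t statistic falls outside them. A short sketch of that comparison, reusing the statistic printed earlier; the count `n_leavers = 3571` is an assumption derived from 14999 × 0.238083, and the sketch uses n - 1 degrees of freedom:

```python
import scipy.stats as stats

t_statistic = -51.3303486754725  # value reported by ttest_1samp above
n_leavers = 3571                 # assumption: 14999 * 0.238083, rounded

left_q = stats.t.ppf(0.025, n_leavers - 1)
right_q = stats.t.ppf(0.975, n_leavers - 1)

# Reject the null hypothesis if the statistic lies outside the central 95% region
reject_null = t_statistic < left_q or t_statistic > right_q
print(left_q, right_q, reject_null)
```

Since the statistic lies far outside the interval, this check agrees with the near-zero p-value.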

# Kernel density estimate of the evaluation score
# fill=True replaces the deprecated shade=True in recent seaborn versions
fig = plt.figure(figsize=(15,4))
ax=sns.kdeplot(df.loc[(df['turnover'] == 0),'evaluation'] , color='b',fill=True,label='no turnover')
ax=sns.kdeplot(df.loc[(df['turnover'] == 1),'evaluation'] , color='r',fill=True, label='turnover')
ax.set(xlabel='Employee Evaluation', ylabel='Density')
ax.legend()
plt.title('Employee Evaluation Distribution - Turnover V.S. No Turnover')

Text(0.5, 1.0, 'Employee Evaluation Distribution - Turnover V.S. No Turnover')

[Figure: density of employee evaluation, turnover vs. no turnover]

# Kernel density estimate of average monthly hours
# fill=True replaces the deprecated shade=True in recent seaborn versions
fig = plt.figure(figsize=(15,4))
ax=sns.kdeplot(df.loc[(df['turnover'] == 0),'averageMonthlyHours'] , color='b',fill=True, label='no turnover')
ax=sns.kdeplot(df.loc[(df['turnover'] == 1),'averageMonthlyHours'] , color='r',fill=True, label='turnover')
ax.legend()
ax.set(xlabel='Employee Average Monthly Hours', ylabel='Density')
plt.title('Employee AverageMonthly Hours Distribution - Turnover V.S. No Turnover')

Text(0.5, 1.0, 'Employee AverageMonthly Hours Distribution - Turnover V.S. No Turnover')

[Figure: density of average monthly hours, turnover vs. no turnover]

# Kernel density estimate of satisfaction
# fill=True replaces the deprecated shade=True in recent seaborn versions
fig = plt.figure(figsize=(15,4))
ax=sns.kdeplot(df.loc[(df['turnover'] == 0),'satisfaction'] , color='b',fill=True, label='no turnover')
ax=sns.kdeplot(df.loc[(df['turnover'] == 1),'satisfaction'] , color='r',fill=True, label='turnover')
plt.title('Employee Satisfaction Distribution - Turnover V.S. No Turnover')
ax.legend()

<matplotlib.legend.Legend at 0x281a5a6b820>

[Figure: density of employee satisfaction, turnover vs. no turnover]

from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, classification_report, precision_score, recall_score, confusion_matrix, precision_recall_curve

# Convert the string columns to integer codes
df["department"] = df["department"].astype('category').cat.codes
df["salary"] = df["salary"].astype('category').cat.codes

# Build X and y
target_name = 'turnover'
X = df.drop('turnover', axis=1)
y = df[target_name]

# Split the data into training and test sets
# Note: stratify=y keeps the share of employees who left the same in the training and test sets as in the full data
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.15, random_state=123, stratify=y)
df.head()

	turnover	satisfaction	evaluation	projectCount	averageMonthlyHours	yearsAtCompany	department	salary
0	1	0.38	0.53	2	157	3	7	1
1	1	0.80	0.86	5	262	6	7	2
2	1	0.11	0.88	7	272	4	7	2
3	1	0.72	0.87	5	223	5	7	1
4	1	0.37	0.52	2	159	3	7	1
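
Two details of this step are easy to overlook: `cat.codes` numbers the categories alphabetically (so "high" becomes 0, "low" 1, and "medium" 2, which is not the natural salary order), and `stratify` preserves the label ratio in both splits. A minimal sketch on a made-up frame (the name `toy` and its values are illustrative only):

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Toy frame (made up): 100 rows, 25% of which have turnover == 1
toy = pd.DataFrame({
    "salary": ["low", "medium", "high", "low"] * 25,
    "turnover": [1, 0, 0, 0] * 25,
})

salary_cat = toy["salary"].astype("category")
# Keep the code -> original label mapping before overwriting the column
code_map = dict(enumerate(salary_cat.cat.categories))
toy["salary"] = salary_cat.cat.codes

X_tr, X_te, y_tr, y_te = train_test_split(
    toy[["salary"]], toy["turnover"],
    test_size=0.2, random_state=123, stratify=toy["turnover"])

print(code_map)
print(y_tr.mean(), y_te.mean())  # share of leavers in each split
```

Tree-based models can split on arbitrary integer codes, so the alphabetical order is workable here; for models that rely on numeric order, an explicit ordered mapping or one-hot encoding would be the safer choice.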

4. Building the prediction models: Decision Tree vs. Random Forest

from sklearn.metrics import roc_auc_score
from sklearn.metrics import classification_report
from sklearn.ensemble import RandomForestClassifier
from sklearn import tree
from sklearn.tree import DecisionTreeClassifier

# Decision tree
dtree = tree.DecisionTreeClassifier(
    criterion='entropy',
    #max_depth=3, # depth of the tree; can be used to prevent overfitting
    min_weight_fraction_leaf=0.01 # minimum fraction of samples a leaf must contain; prevents overfitting
    )
dtree = dtree.fit(X_train,y_train)
print("\n\n ---Decision Tree---")
# Note: roc_auc_score receives hard 0/1 predictions here, so the value describes a single operating point rather than the full probability ranking
dt_roc_auc = roc_auc_score(y_test, dtree.predict(X_test))
print("Decision Tree AUC = %2.2f" % dt_roc_auc)
print(classification_report(y_test, dtree.predict(X_test)))

# Random forest
rf = RandomForestClassifier(
    criterion='entropy',
    n_estimators=1000,
    max_depth=None, # depth of the trees; can be used to prevent overfitting
    min_samples_split=10, # minimum number of samples a node needs before it is split further
    #min_weight_fraction_leaf=0.02 # minimum fraction of samples a leaf must contain; prevents overfitting
    )
rf.fit(X_train, y_train)
print("\n\n ---Random Forest---")
rf_roc_auc = roc_auc_score(y_test, rf.predict(X_test))
print("Random Forest AUC = %2.2f" % rf_roc_auc)
print(classification_report(y_test, rf.predict(X_test)))

 ---Decision Tree---
Decision Tree AUC = 0.93
              precision    recall  f1-score   support

           0       0.97      0.98      0.97      1714
           1       0.93      0.89      0.91       536

    accuracy                           0.96      2250
   macro avg       0.95      0.93      0.94      2250
weighted avg       0.96      0.96      0.96      2250

 ---Random Forest---
Random Forest AUC = 0.97
              precision    recall  f1-score   support

           0       0.98      1.00      0.99      1714
           1       0.99      0.94      0.97       536

    accuracy                           0.98      2250
   macro avg       0.99      0.97      0.98      2250
weighted avg       0.98      0.98      0.98      2250
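
The AUC values above are computed from hard 0/1 predictions. `roc_auc_score` also accepts predicted probabilities, which evaluates the ranking across all thresholds and will generally give a different number. The sketch below contrasts the two on synthetic data from `make_classification` (the data, the `_demo` names, and the resulting scores are illustrative and unrelated to the HR results):

```python
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic binary classification data (not the HR dataset)
X_demo, y_demo = make_classification(n_samples=1000, n_features=8, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.15, random_state=123, stratify=y_demo)

clf = DecisionTreeClassifier(criterion="entropy",
                             min_weight_fraction_leaf=0.01,
                             random_state=0)
clf.fit(X_tr, y_tr)

# AUC from hard labels: one operating point (the default 0.5 cut-off)
auc_from_labels = roc_auc_score(y_te, clf.predict(X_te))
# AUC from probabilities of the positive class: all thresholds
auc_from_scores = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])
print(auc_from_labels, auc_from_scores)
```

When AUC figures are compared across models or articles, it helps to state which of the two variants was used.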

5. Model evaluation

5.1 ROC curve

# ROC curve
from sklearn.metrics import roc_curve
rf_fpr, rf_tpr, rf_thresholds = roc_curve(y_test, rf.predict_proba(X_test)[:,1])
dt_fpr, dt_tpr, dt_thresholds = roc_curve(y_test, dtree.predict_proba(X_test)[:,1])

plt.figure()

# Random forest ROC
plt.plot(rf_fpr, rf_tpr, label='Random Forest (area = %0.2f)' % rf_roc_auc)

# Decision tree ROC
plt.plot(dt_fpr, dt_tpr, label='Decision Tree (area = %0.2f)' % dt_roc_auc)

plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('ROC Graph')
plt.legend(loc="lower right")
plt.show()

[Figure: ROC curves of the random forest and the decision tree]
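
To make the curve less of a black box, here is a tiny example of what `roc_curve` returns. Each threshold on the score produces one (false positive rate, true positive rate) point, and `auc` integrates the resulting curve. The four labels and scores are made up for illustration:

```python
import numpy as np
from sklearn.metrics import auc, roc_curve

# Four made-up samples: true labels and predicted scores
y_true = np.array([0, 0, 1, 1])
scores = np.array([0.1, 0.4, 0.35, 0.8])

fpr, tpr, thresholds = roc_curve(y_true, scores)
area = auc(fpr, tpr)  # trapezoidal area under the (fpr, tpr) points

print(fpr)   # false positive rate at each threshold
print(tpr)   # true positive rate at each threshold
print(area)  # 0.75 for this toy example
```

The negative sample scored 0.4 outranks the positive sample scored 0.35, which is what keeps the area below 1.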

5.2 Analyzing feature importance with the tree-based models

## Plot the feature importances of the random forest ##
importances = rf.feature_importances_
feat_names = df.drop(['turnover'],axis=1).columns

indices = np.argsort(importances)[::-1]
plt.figure(figsize=(12,6))
plt.title("Feature importances by RandomForest")
plt.bar(range(len(indices)), importances[indices], color='lightblue',  align="center")
plt.step(range(len(indices)), np.cumsum(importances[indices]), where='mid', label='Cumulative')
plt.xticks(range(len(indices)), feat_names[indices], rotation='vertical',fontsize=14)
plt.xlim([-1, len(indices)])
plt.show()

[Figure: feature importances by random forest]

## Plot the feature importances of the decision tree ##
importances = dtree.feature_importances_
feat_names = df.drop(['turnover'],axis=1).columns

indices = np.argsort(importances)[::-1]
plt.figure(figsize=(12,6))
plt.title("Feature importances by Decision Tree")
plt.bar(range(len(indices)), importances[indices], color='lightblue',  align="center")
plt.step(range(len(indices)), np.cumsum(importances[indices]), where='mid', label='Cumulative')
plt.xticks(range(len(indices)), feat_names[indices], rotation='vertical',fontsize=14)
plt.xlim([-1, len(indices)])
plt.show()

[Figure: feature importances by decision tree]
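
The same ranking can be inspected as a table instead of a bar chart by wrapping `feature_importances_` in a pandas Series. The sketch below uses synthetic data; the names `feature_0` to `feature_4` are hypothetical placeholders, not columns of the HR dataset:

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic data with 2 informative features out of 5 (not the HR dataset)
X_arr, y_demo = make_classification(n_samples=500, n_features=5,
                                    n_informative=2, n_redundant=0,
                                    random_state=0)
feature_names = [f"feature_{i}" for i in range(X_arr.shape[1])]  # hypothetical names
X_demo = pd.DataFrame(X_arr, columns=feature_names)

forest = RandomForestClassifier(n_estimators=100, random_state=0)
forest.fit(X_demo, y_demo)

# Importances sum to 1; sort them and show the running total
ranked = pd.Series(forest.feature_importances_,
                   index=X_demo.columns).sort_values(ascending=False)
print(ranked)
print(ranked.cumsum())
```

These are impurity-based importances, which can favor features with many distinct values; `sklearn.inspection.permutation_importance` on the test set is one way to cross-check the ranking.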
