
Reproducing Paper C155, Showcased on the Official Site, for Problem C of the 2022 National Mathematical Modeling Contest

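  • 1. Content comparison
  • 2. Reproduction code for Question 1, sub-question 2
    • 2.1 Merging the sheets
    • 2.2 Normality test of the data
      • 2.2.1 Normality test plots
    • 2.3 Normality not satisfied: applying the centered log-ratio transformation
      • 2.3.1 Core step: replacing -inf with 0
      • 2.3.2 Centered log-ratio transformation plots
    • 2.4 Descriptive statistics
    • 2.5 Box plots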

See GitHub for the complete reproduction of the paper.

1. Content comparison

Box plot comparison
Contest paper C155:
[image]
Reproduction:
[image]

2. Reproduction code for Question 1, sub-question 2

import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
# Use the SimHei font so that Chinese labels render in matplotlib figures
plt.rcParams['font.sans-serif'] = ['SimHei']
# Load all sheets of the Excel file
xl_file = pd.ExcelFile("E:\\数学建模国赛\\2022数学建模赛题\\C题\\附件.xlsx")
# Load individual sheets with correct names
sheet1 = xl_file.parse('表单1')  # basic information on the glass artifacts
sheet2 = xl_file.parse('表单2')  # chemical composition of the classified glass artifacts
sheet3 = xl_file.parse('表单3')  # chemical composition of the unclassified glass artifacts
# Show the first few rows of each sheet
sheet1.head(), sheet2.head(), sheet3.head()
(   文物编号 纹饰  类型  颜色 表面风化
 0     1  C  高钾  蓝绿  无风化
 1     2  A  铅钡  浅蓝   风化
 2     3  A  高钾  蓝绿  无风化
 3     4  A  高钾  蓝绿  无风化
 4     5  A  高钾  蓝绿  无风化,
    文物采样点  二氧化硅(SiO2)  氧化钠(Na2O)  氧化钾(K2O)  氧化钙(CaO)  氧化镁(MgO)  氧化铝(Al2O3)  \
 0     01       69.33        NaN      9.99      6.32      0.87        3.93
 1     02       36.28        NaN      1.05      2.34      1.18        5.73
 2  03部位1       87.05        NaN      5.19      2.01       NaN        4.06
 3  03部位2       61.71        NaN     12.37      5.87      1.11        5.50
 4     04       65.88        NaN      9.67      7.12      1.56        6.44

    氧化铁(Fe2O3)  氧化铜(CuO)  氧化铅(PbO)  氧化钡(BaO)  五氧化二磷(P2O5)  氧化锶(SrO)  氧化锡(SnO2)  \
 0        1.74      3.87       NaN       NaN         1.17       NaN        NaN
 1        1.86      0.26     47.43       NaN         3.57      0.19        NaN
 2         NaN      0.78      0.25       NaN         0.66       NaN        NaN
 3        2.16      5.09      1.41      2.86         0.70      0.10        NaN
 4        2.06      2.18       NaN       NaN         0.79       NaN        NaN

    二氧化硫(SO2)
 0       0.39
 1        NaN
 2        NaN
 3        NaN
 4       0.36  ,
   文物编号 表面风化  二氧化硅(SiO2)  氧化钠(Na2O)  氧化钾(K2O)  氧化钙(CaO)  氧化镁(MgO)  氧化铝(Al2O3)  \
 0   A1  无风化       78.45        NaN       NaN      6.08      1.86        7.23
 1   A2   风化       37.75        NaN       NaN      7.63       NaN        2.33
 2   A3  无风化       31.95        NaN      1.36      7.19      0.81        2.93
 3   A4  无风化       35.47        NaN      0.79      2.89      1.05        7.07
 4   A5   风化       64.29        1.2      0.37      1.64      2.34       12.75

    氧化铁(Fe2O3)  氧化铜(CuO)  氧化铅(PbO)  氧化钡(BaO)  五氧化二磷(P2O5)  氧化锶(SrO)  氧化锡(SnO2)  \
 0        2.15      2.11       NaN       NaN         1.06      0.03        NaN
 1         NaN       NaN     34.30       NaN        14.27       NaN        NaN
 2        7.06      0.21     39.58      4.69         2.68      0.52        NaN
 3        6.45      0.96     24.28      8.31         8.45      0.28        NaN
 4        0.81      0.94     12.23      2.16         0.19      0.21       0.49

    二氧化硫(SO2)
 0       0.51
 1        NaN
 2        NaN
 3        NaN
 4        NaN  )
sheet2
    文物采样点  二氧化硅(SiO2)  氧化钠(Na2O)  氧化钾(K2O)  氧化钙(CaO)  氧化镁(MgO)  氧化铝(Al2O3)  氧化铁(Fe2O3)  氧化铜(CuO)  氧化铅(PbO)  氧化钡(BaO)  五氧化二磷(P2O5)  氧化锶(SrO)  氧化锡(SnO2)  二氧化硫(SO2)
0   01  69.33  NaN  9.99  6.32  0.87  3.93  1.74  3.87  NaN  NaN  1.17  NaN  NaN  0.39
1   02  36.28  NaN  1.05  2.34  1.18  5.73  1.86  0.26  47.43  NaN  3.57  0.19  NaN  NaN
2   03部位1  87.05  NaN  5.19  2.01  NaN  4.06  NaN  0.78  0.25  NaN  0.66  NaN  NaN  NaN
3   03部位2  61.71  NaN  12.37  5.87  1.11  5.50  2.16  5.09  1.41  2.86  0.70  0.10  NaN  NaN
4   04  65.88  NaN  9.67  7.12  1.56  6.44  2.06  2.18  NaN  NaN  0.79  NaN  NaN  0.36
... ...
64  54严重风化点  17.11  NaN  NaN  NaN  1.11  3.65  NaN  1.34  58.46  NaN  14.13  1.12  NaN  NaN
65  55  49.01  2.71  NaN  1.13  NaN  1.45  NaN  0.86  32.92  7.95  0.35  NaN  NaN  NaN
66  56  29.15  NaN  NaN  1.21  NaN  1.85  NaN  0.79  41.25  15.45  2.54  NaN  NaN  NaN
67  57  25.42  NaN  NaN  1.31  NaN  2.18  NaN  1.16  45.10  17.30  NaN  NaN  NaN  NaN
68  58  30.39  NaN  0.34  3.49  0.79  3.52  0.86  3.13  39.35  7.66  8.99  0.24  NaN  NaN

69 rows × 15 columns

component_cols = ['二氧化硅(SiO2)', '氧化钠(Na2O)', '氧化钾(K2O)', '氧化钙(CaO)', '氧化镁(MgO)', '氧化铝(Al2O3)', '氧化铁(Fe2O3)', '氧化铜(CuO)', '氧化铅(PbO)', '氧化钡(BaO)', '五氧化二磷(P2O5)', '氧化锶(SrO)', '氧化锡(SnO2)', '二氧化硫(SO2)']
# Row total over all components (sum skips NaN); stored in the column 成分总和 (component total)
sheet2['成分总和'] = sheet2[component_cols].sum(axis=1)
sheet2['成分总和']
# Keep only rows whose total lies between 85 and 105; .copy() avoids SettingWithCopyWarning on the assignments below
sheet2 = sheet2[(sheet2['成分总和'] >= 85) & (sheet2['成分总和'] <= 105)].copy()
sheet2
# Treat undetected components (NaN) as 0
sheet2 = sheet2.fillna(0)
# Normalize the chemical components to sum up to 100%
sheet2[component_cols] = sheet2[component_cols].div(sheet2[component_cols].sum(axis=1), axis=0) * 100
sheet2['成分总和'] = sheet2[component_cols].sum(axis=1)
sheet2
    文物采样点  二氧化硅(SiO2)  氧化钠(Na2O)  氧化钾(K2O)  氧化钙(CaO)  氧化镁(MgO)  氧化铝(Al2O3)  氧化铁(Fe2O3)  氧化铜(CuO)  氧化铅(PbO)  氧化钡(BaO)  五氧化二磷(P2O5)  氧化锶(SrO)  氧化锡(SnO2)  二氧化硫(SO2)  成分总和
0   01  71.027559  0.000000  10.234607  6.474746  0.891302  4.026227  1.782604  3.964758  0.000000  0.000000  1.198648  0.000000  0.0  0.399549  100.0
1   02  36.319952  0.000000  1.051156  2.342577  1.181299  5.736310  1.862048  0.260286  47.482230  0.000000  3.573931  0.190209  0.0  0.000000  100.0
2   03部位1  87.050000  0.000000  5.190000  2.010000  0.000000  4.060000  0.000000  0.780000  0.250000  0.000000  0.660000  0.000000  0.0  0.000000  100.0
3   03部位2  62.408981  0.000000  12.510113  5.936489  1.122573  5.562298  2.184466  5.147654  1.425971  2.892395  0.707929  0.101133  0.0  0.000000  100.0
4   04  68.582136  0.000000  10.066625  7.412034  1.623985  6.704143  2.144493  2.269415  0.000000  0.000000  0.822403  0.000000  0.0  0.374766  100.0
... ...
64  54严重风化点  17.653735  0.000000  0.000000  0.000000  1.145274  3.765993  0.000000  1.382584  60.317788  0.000000  14.579034  1.155592  0.0  0.000000  100.0
65  55  50.850799  2.811787  0.000000  1.172442  0.000000  1.504462  0.000000  0.892301  34.156464  8.248599  0.363146  0.000000  0.0  0.000000  100.0
66  56  31.602342  0.000000  0.000000  1.311795  0.000000  2.005637  0.000000  0.856461  44.720295  16.749783  2.753686  0.000000  0.0  0.000000  100.0
67  57  27.489997  0.000000  0.000000  1.416676  0.000000  2.357521  0.000000  1.254461  48.772575  18.708770  0.000000  0.000000  0.0  0.000000  100.0
68  58  30.771567  0.000000  0.344269  3.533819  0.799919  3.564196  0.870798  3.169299  39.844066  7.756177  9.102876  0.243013  0.0  0.000000  100.0

67 rows × 16 columns
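The filtering and rescaling step above can be tried on a small table. The following is a minimal sketch on made-up numbers, not the contest data: the frame `toy`, its four columns and the variable `valid` are hypothetical, while the 85 to 105 band and the rescaling to 100 follow the code above.

```python
import numpy as np
import pandas as pd

# Hypothetical toy compositions in percent; NaN marks an undetected component
toy = pd.DataFrame({
    "SiO2": [69.33, 36.28, 40.00],
    "K2O":  [9.99, 1.05, np.nan],
    "PbO":  [np.nan, 47.43, 20.00],
    "CaO":  [18.29, 12.34, 10.00],
})
cols = list(toy.columns)

# Row totals; sum() skips NaN by default
total = toy[cols].sum(axis=1)

# Keep rows whose total lies in the 85-105 band, working on a copy
valid = toy.loc[total.between(85, 105)].copy()

# Treat undetected components as 0, then rescale each row to sum to 100
valid[cols] = valid[cols].fillna(0)
valid[cols] = valid[cols].div(valid[cols].sum(axis=1), axis=0) * 100

print(valid.round(3))
```

The third toy row totals 70 and is dropped, which mirrors how the real table shrinks from 69 to 67 rows.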

sheet2_copy = sheet2.copy()
sheet2 = sheet2_copy

# Define the new column names
new_component_cols = ['SiO2', 'Na2O', 'K2O', 'CaO', 'MgO', 'Al2O3', 'Fe2O3', 'CuO', 'PbO', 'BaO', 'P2O5', 'SrO', 'SnO2', 'SO2']
# Create a mapping from old column names to new column names
rename_dict = dict(zip(component_cols, new_component_cols))
# Rename the columns
sheet2.rename(columns=rename_dict, inplace=True)
# Check the updated column names
sheet2.columns
Index(['文物采样点', 'SiO2', 'Na2O', 'K2O', 'CaO', 'MgO', 'Al2O3', 'Fe2O3', 'CuO',
       'PbO', 'BaO', 'P2O5', 'SrO', 'SnO2', 'SO2', '成分总和'],
      dtype='object')

2.1 Merging the sheets

# Merge sheet1 and sheet2 on 文物编号 (artifact number)
# First, we need to extract the 文物编号 from the 文物采样点 (sampling point) in sheet2
# We assume that the 文物编号 is the numeric part before any non-numeric character in the 文物采样点
# Import regular expression library
import re

# Define a function to extract 文物编号 from 文物采样点
def extract_number(s):
    match = re.match(r"(\d+)", s)
    return int(match.group()) if match else None

# Apply the function to the 文物采样点 column
sheet2['文物编号'] = sheet2['文物采样点'].apply(extract_number)
# Merge sheet1 and sheet2
data = pd.merge(sheet1, sheet2, on='文物编号')
# nan for zero
data
    文物编号  纹饰  类型  颜色  表面风化  文物采样点  SiO2  Na2O  K2O  CaO  ...  Al2O3  Fe2O3  CuO  PbO  BaO  P2O5  SrO  SnO2  SO2  成分总和
0   1  C  高钾  蓝绿  无风化  01  71.027559  0.000000  10.234607  6.474746  ...  4.026227  1.782604  3.964758  0.000000  0.000000  1.198648  0.000000  0.0  0.399549  100.0
1   2  A  铅钡  浅蓝  风化  02  36.319952  0.000000  1.051156  2.342577  ...  5.736310  1.862048  0.260286  47.482230  0.000000  3.573931  0.190209  0.0  0.000000  100.0
2   3  A  高钾  蓝绿  无风化  03部位1  87.050000  0.000000  5.190000  2.010000  ...  4.060000  0.000000  0.780000  0.250000  0.000000  0.660000  0.000000  0.0  0.000000  100.0
3   3  A  高钾  蓝绿  无风化  03部位2  62.408981  0.000000  12.510113  5.936489  ...  5.562298  2.184466  5.147654  1.425971  2.892395  0.707929  0.101133  0.0  0.000000  100.0
4   4  A  高钾  蓝绿  无风化  04  68.582136  0.000000  10.066625  7.412034  ...  6.704143  2.144493  2.269415  0.000000  0.000000  0.822403  0.000000  0.0  0.374766  100.0
... ...
62  54  C  铅钡  浅蓝  风化  54严重风化点  17.653735  0.000000  0.000000  0.000000  ...  3.765993  0.000000  1.382584  60.317788  0.000000  14.579034  1.155592  0.0  0.000000  100.0
63  55  C  铅钡  绿  无风化  55  50.850799  2.811787  0.000000  1.172442  ...  1.504462  0.000000  0.892301  34.156464  8.248599  0.363146  0.000000  0.0  0.000000  100.0
64  56  C  铅钡  蓝绿  风化  56  31.602342  0.000000  0.000000  1.311795  ...  2.005637  0.000000  0.856461  44.720295  16.749783  2.753686  0.000000  0.0  0.000000  100.0
65  57  C  铅钡  蓝绿  风化  57  27.489997  0.000000  0.000000  1.416676  ...  2.357521  0.000000  1.254461  48.772575  18.708770  0.000000  0.000000  0.0  0.000000  100.0
66  58  C  铅钡  NaN  风化  58  30.771567  0.000000  0.344269  3.533819  ...  3.564196  0.870798  3.169299  39.844066  7.756177  9.102876  0.243013  0.0  0.000000  100.0

67 rows × 21 columns
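To see how the key extraction behaves when one artifact has several sampling points, here is a small self-contained sketch. The frames `info` and `samples` are hypothetical miniatures of 表单1 and 表单2, and the `validate="one_to_many"` argument is an addition that is not in the original code; it asks pandas to raise an error if an artifact number repeats on the left side.

```python
import re
import pandas as pd

def extract_number(label):
    """Return the leading digits of a sampling-point label as an int, or None."""
    match = re.match(r"(\d+)", label)
    return int(match.group()) if match else None

# Hypothetical miniature versions of the two sheets
info = pd.DataFrame({"文物编号": [2, 3], "类型": ["铅钡", "高钾"]})
samples = pd.DataFrame({
    "文物采样点": ["02", "03部位1", "03部位2"],
    "SiO2": [36.28, 87.05, 61.71],
})

# "03部位1" and "03部位2" both map to artifact number 3
samples["文物编号"] = samples["文物采样点"].apply(extract_number)

# One artifact row may match several sampling-point rows
merged = pd.merge(info, samples, on="文物编号", validate="one_to_many")
print(merged)
```

Because artifact 3 has two sampling points, its row from `info` appears twice in `merged`, which is why the real merged table has duplicate artifact numbers in rows 2 and 3.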

data.drop(['颜色','纹饰','文物编号','成分总和'],axis=1,inplace=True)
data
    类型  表面风化  文物采样点  SiO2  Na2O  K2O  CaO  MgO  Al2O3  Fe2O3  CuO  PbO  BaO  P2O5  SrO  SnO2  SO2
0   高钾  无风化  01  71.027559  0.000000  10.234607  6.474746  0.891302  4.026227  1.782604  3.964758  0.000000  0.000000  1.198648  0.000000  0.0  0.399549
1   铅钡  风化  02  36.319952  0.000000  1.051156  2.342577  1.181299  5.736310  1.862048  0.260286  47.482230  0.000000  3.573931  0.190209  0.0  0.000000
2   高钾  无风化  03部位1  87.050000  0.000000  5.190000  2.010000  0.000000  4.060000  0.000000  0.780000  0.250000  0.000000  0.660000  0.000000  0.0  0.000000
3   高钾  无风化  03部位2  62.408981  0.000000  12.510113  5.936489  1.122573  5.562298  2.184466  5.147654  1.425971  2.892395  0.707929  0.101133  0.0  0.000000
4   高钾  无风化  04  68.582136  0.000000  10.066625  7.412034  1.623985  6.704143  2.144493  2.269415  0.000000  0.000000  0.822403  0.000000  0.0  0.374766
... ...
62  铅钡  风化  54严重风化点  17.653735  0.000000  0.000000  0.000000  1.145274  3.765993  0.000000  1.382584  60.317788  0.000000  14.579034  1.155592  0.0  0.000000
63  铅钡  无风化  55  50.850799  2.811787  0.000000  1.172442  0.000000  1.504462  0.000000  0.892301  34.156464  8.248599  0.363146  0.000000  0.0  0.000000
64  铅钡  风化  56  31.602342  0.000000  0.000000  1.311795  0.000000  2.005637  0.000000  0.856461  44.720295  16.749783  2.753686  0.000000  0.0  0.000000
65  铅钡  风化  57  27.489997  0.000000  0.000000  1.416676  0.000000  2.357521  0.000000  1.254461  48.772575  18.708770  0.000000  0.000000  0.0  0.000000
66  铅钡  风化  58  30.771567  0.000000  0.344269  3.533819  0.799919  3.564196  0.870798  3.169299  39.844066  7.756177  9.102876  0.243013  0.0  0.000000

67 rows × 17 columns

data.shape
(67, 17)
#data.to_excel('E:\\数学建模国赛\\2022数学建模赛题\\C题\\一二表单合并数据.xlsx', index=True)

2.2 Normality test of the data

"""
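For some statistical analyses, such as regression analysis, normality of the data is a key assumption.
Whether such a transformation is needed, however, depends on the characteristics of the data and on the goal of the analysis.
Now let us look at the data.
Since this is chemical composition data, and the earlier analysis showed that its distribution is not fully normal,
I recommend analysing it after a centered log-ratio transformation. This helps the data come closer to the assumptions of the statistical analysis and handles the constraints of compositional data better.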
"""
# Normality check: look at the distribution of these chemical components.
import matplotlib.pyplot as plt
# Select only the columns that are numeric and not categorical
numeric_cols = data.select_dtypes(include='number').columns

2.2.1 Normality test plots

# Plot histograms for each numeric column
fig, axs = plt.subplots(len(numeric_cols), figsize=(10, len(numeric_cols)*3))
for i, col in enumerate(numeric_cols):
    axs[i].hist(data[col].dropna(), bins=30, color='skyblue', edgecolor='black', alpha=0.7)
    axs[i].set_title(f'Histogram of {col}')
plt.tight_layout()
plt.show()

[Figure output_12_0.png: histogram of each numeric column]

data_raw=data.copy()

"""
Normality test: we use the Shapiro-Wilk test to check the normality of each chemical component.
It is a commonly used normality test whose null hypothesis is that the data come from a normal distribution.
If the p-value is below 0.05, we reject the null hypothesis and conclude that the data are not normally distributed.
"""
from scipy.stats import shapiro, levene

# Collect one result dict per variable and build the DataFrame once at the end
# (DataFrame.append is deprecated and was removed in pandas 2.0)
rows = []
# Loop over each numeric column
for col in numeric_cols:
    # Initialize a dict to store the results for this variable
    col_results = {'Variable': col}
    # Normality test; drop NA values before performing the test
    _, p_normal = shapiro(data[col].dropna())
    col_results['Normality p-value'] = p_normal
    col_results['Normal'] = p_normal > 0.05
    # Variance equality test (only if the data is normal)
    # '无风化' = unweathered, '风化' = weathered
    if col_results['Normal']:
        _, p_equal_var = levene(data.loc[data['表面风化'] == '无风化', col].dropna(),
                                data.loc[data['表面风化'] == '风化', col].dropna())
        col_results['Equal var p-value'] = p_equal_var
        col_results['Equal var'] = p_equal_var > 0.05
    rows.append(col_results)
test_results = pd.DataFrame(rows)
# Now, the test_results dataframe contains the p-values for normality and equal variances
# for each numeric variable, without any transformation applied to the data.
test_results
    Variable  Normality p-value  Normal  Equal var p-value  Equal var
0   SiO2   5.434923e-02  True   0.009129  False
1   Na2O   5.631047e-13  False  NaN       NaN
2   K2O    2.218287e-13  False  NaN       NaN
3   CaO    8.905178e-06  False  NaN       NaN
4   MgO    1.066307e-05  False  NaN       NaN
5   Al2O3  1.085733e-06  False  NaN       NaN
6   Fe2O3  1.809425e-09  False  NaN       NaN
7   CuO    3.633815e-09  False  NaN       NaN
8   PbO    7.531955e-04  False  NaN       NaN
9   BaO    7.773099e-08  False  NaN       NaN
10  P2O5   4.346846e-09  False  NaN       NaN
11  SrO    6.648307e-06  False  NaN       NaN
12  SnO2   8.658932e-17  False  NaN       NaN
13  SO2    5.878219e-17  False  NaN       NaN
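The very small p-values for components such as Na2O, SnO2 and SO2 are what one would expect from columns that are mostly zeros. The sketch below illustrates this on synthetic data only: the two samples, the seed and the helper `looks_normal` are assumptions made for illustration, and the sample size of 67 is chosen to match the merged table.

```python
import numpy as np
from scipy.stats import shapiro

rng = np.random.default_rng(0)  # fixed seed so the sketch is repeatable

# Two hypothetical samples with 67 observations each
roughly_normal = rng.normal(loc=50.0, scale=10.0, size=67)
zero_inflated = np.where(rng.random(67) < 0.7, 0.0, rng.exponential(2.0, size=67))

def looks_normal(values, alpha=0.05):
    """Return (p_value, decision); decision is True when normality is NOT rejected."""
    _, p_value = shapiro(values)
    return p_value, bool(p_value > alpha)

p_a, ok_a = looks_normal(roughly_normal)
p_b, ok_b = looks_normal(zero_inflated)
print(f"normal-like sample:   p = {p_a:.3g}, not rejected = {ok_a}")
print(f"zero-inflated sample: p = {p_b:.3g}, not rejected = {ok_b}")
```

A sample with a large spike at zero cannot look Gaussian, so the test rejects it regardless of how the non-zero part is distributed.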

2.3 Normality not satisfied: applying the centered log-ratio transformation

from scipy.stats.mstats import gmean

data_centralized = data.copy()
# Select the numeric columns
numeric_data = data_centralized.select_dtypes(include='number')
# Compute the geometric mean of the non-zero elements of each row
geo_means = []
for index, row in numeric_data.iterrows():
    non_zero_values = row[row > 0]
    geo_mean = gmean(non_zero_values) if len(non_zero_values) > 0 else 1e-6
    geo_means.append(geo_mean)
# Divide each value by the geometric mean of the non-zero elements of its row, then take the logarithm
for col in numeric_data.columns:
    data_centralized[col] = np.log(numeric_data[col] / geo_means)
data_centralized.head()
D:\py1.1\envs\pytorch\lib\site-packages\pandas\core\arraylike.py:402: RuntimeWarning: divide by zero encountered in log
  result = getattr(ufunc, method)(*inputs, **kwargs)
   类型  表面风化  文物采样点  SiO2  Na2O  K2O  CaO  MgO  Al2O3  Fe2O3  CuO  PbO  BaO  P2O5  SrO  SnO2  SO2
0  高钾  无风化  01  3.045978  -inf  1.108685  0.650820  -1.332161  0.175740  -0.639014  0.160355  -inf  -inf  -1.035896  -inf  -inf  -2.134508
1  铅钡  风化  02  2.676664  -inf  -0.865813  -0.064452  -0.749089  0.831113  -0.294026  -2.261677  2.944652  -inf  0.357963  -2.575334  -inf  -inf
2  高钾  无风化  03部位1  3.586159  -inf  0.766410  -0.182189  -inf  0.520860  -inf  -1.128785  -2.266618  -inf  -1.295839  -inf  -inf  -inf
3  高钾  无风化  03部位2  3.090699  -inf  1.483527  0.738107  -0.927387  0.673001  -0.261639  0.595531  -0.688158  0.019074  -1.388422  -3.334332  -inf  -inf
4  高钾  无风化  04  2.968764  -inf  1.049957  0.743836  -0.774386  0.643457  -0.496365  -0.439747  -inf  -inf  -1.454794  -inf  -inf  -2.240723
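Written as a formula, the loop above computes a centered log-ratio in which the geometric mean is taken over the non-zero parts of each row only. This is a variant of the textbook definition, which uses all D parts. The symbols g_+ and D_+ are notation introduced here for illustration and do not appear in the original code.

```latex
\[
\operatorname{clr}(x)_i = \ln\frac{x_i}{g_+(x)},
\qquad
g_+(x) = \Bigl(\prod_{j:\,x_j>0} x_j\Bigr)^{1/D_+},
\qquad
D_+ = \#\{\, j : x_j > 0 \,\}
\]
```

A component with x_i = 0 gives ln 0, which is where the -inf entries and the RuntimeWarning come from. As a consistency check on row 0 of the output: the SiO2 value 3.045978 implies g_+ ≈ 71.027559 / e^{3.045978} ≈ 3.377, and ln(10.234607 / 3.377) ≈ 1.109 agrees with the K2O entry 1.108685.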

2.3.1 Core step: replacing -inf with 0

# Replace -inf values (produced by log(0) for zero components) with 0
#plt.rcParams['font.family'] = 'DejaVu Sans'
selected_cols = new_component_cols
data_centralized.replace(-np.inf, 0, inplace=True)
data_centralized
    类型  表面风化  文物采样点  SiO2  Na2O  K2O  CaO  MgO  Al2O3  Fe2O3  CuO  PbO  BaO  P2O5  SrO  SnO2  SO2
0   高钾  无风化  01  3.045978  0.000000  1.108685  0.650820  -1.332161  0.175740  -0.639014  0.160355  0.000000  0.000000  -1.035896  0.000000  0.0  -2.134508
1   铅钡  风化  02  2.676664  0.000000  -0.865813  -0.064452  -0.749089  0.831113  -0.294026  -2.261677  2.944652  0.000000  0.357963  -2.575334  0.0  0.000000
2   高钾  无风化  03部位1  3.586159  0.000000  0.766410  -0.182189  0.000000  0.520860  0.000000  -1.128785  -2.266618  0.000000  -1.295839  0.000000  0.0  0.000000
3   高钾  无风化  03部位2  3.090699  0.000000  1.483527  0.738107  -0.927387  0.673001  -0.261639  0.595531  -0.688158  0.019074  -1.388422  -3.334332  0.0  0.000000
4   高钾  无风化  04  2.968764  0.000000  1.049957  0.743836  -0.774386  0.643457  -0.496365  -0.439747  0.000000  0.000000  -1.454794  0.000000  0.0  -2.240723
... ...
62  铅钡  风化  54严重风化点  1.216607  0.000000  0.000000  0.000000  -1.518696  -0.328329  0.000000  -1.330386  2.445287  0.000000  1.025244  -1.509727  0.0  0.000000
63  铅钡  无风化  55  2.673354  -0.221722  0.000000  -1.096453  0.000000  -0.847107  0.000000  -1.369493  2.275410  0.854502  -2.268492  0.000000  0.0  0.000000
64  铅钡  风化  56  1.753603  0.000000  0.000000  -1.428231  0.000000  -1.003666  0.000000  -1.854574  2.100799  1.118757  -0.686688  0.000000  0.0  0.000000
65  铅钡  风化  57  1.386720  0.000000  0.000000  -1.578789  0.000000  -1.069491  0.000000  -1.700396  1.960066  1.001890  0.000000  0.000000  0.0  0.000000
66  铅钡  风化  58  2.316326  0.000000  -2.176597  0.152115  -1.333510  0.160674  -1.248610  0.043246  2.574709  0.938225  1.098326  -2.524904  0.0  0.000000

67 rows × 17 columns
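Replacing -inf with 0 makes an absent component indistinguishable from one whose value equals the row's geometric mean, since both end up as 0 on the log-ratio scale. One common alternative, which is not what this reproduction does, is to substitute a small positive value for the zeros before transforming. The sketch below shows that idea under stated assumptions: the function name `clr_with_pseudocount`, the toy frame and the value `eps=0.04` are all hypothetical and chosen only for illustration.

```python
import numpy as np
import pandas as pd

def clr_with_pseudocount(frame: pd.DataFrame, eps: float = 0.04) -> pd.DataFrame:
    """CLR after replacing zeros with a small positive value eps (an assumption)."""
    filled = frame.where(frame > 0, eps)                   # zeros (and NaN) become eps
    closed = filled.div(filled.sum(axis=1), axis=0)        # re-close each row to sum to 1
    log_parts = np.log(closed)
    return log_parts.sub(log_parts.mean(axis=1), axis=0)   # subtract the log geometric mean

# Hypothetical two-row composition with zeros
toy = pd.DataFrame({"SiO2": [71.0, 36.3], "K2O": [10.2, 0.0], "PbO": [0.0, 47.5]})
clr = clr_with_pseudocount(toy)
print(clr.round(3))
```

With this variant every entry is finite, each row sums to zero, and an absent component lands at the low end of the scale instead of in the middle. The result does depend on the choice of `eps`, so it should be reported alongside any statistics computed from it.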


2.3.2 Centered log-ratio transformation plots

# Visual comparison between raw data and centralized log ratio transformed data for selected columns
plt.rcParams['font.family'] = 'DejaVu Sans'
fig, axs = plt.subplots(len(selected_cols), 2, figsize=(15, len(selected_cols)*3))
for i, col in enumerate(selected_cols):
    # Plot raw data
    axs[i, 0].hist(data_raw[col].dropna(), bins=30, color='skyblue', edgecolor='black', alpha=0.7)
    axs[i, 0].set_title(f'Raw data: {col}')
    # Plot centralized log ratio transformed data
    axs[i, 1].hist(data_centralized[col].dropna(), bins=30, color='salmon', edgecolor='black', alpha=0.7)
    axs[i, 1].set_title(f'Centralized Log Ratio: {col}')
plt.tight_layout()
plt.show()

[Figure output_21_0.png: histograms of the raw data (left) and the centered log-ratio transformed data (right) for each component]

#data_centralized.to_excel('E:\\数学建模国赛\\2022数学建模赛题\\C题\\一二表单合并对数中心化转换数据.xlsx', index=True)
data = data_centralized
# Count the unique values in the '类型' (glass type) and '表面风化' (surface weathering) columns
glass_types = data['类型'].unique()
weathering_states = data['表面风化'].unique()
glass_types, weathering_states
(array(['高钾', '铅钡'], dtype=object), array(['无风化', '风化'], dtype=object))
# Initialize an empty DataFrame to store the results
grouped_stats = pd.DataFrame()
component_cols = ['SiO2', 'Na2O', 'K2O', 'CaO', 'MgO', 'Al2O3', 'Fe2O3', 'CuO', 'PbO', 'BaO', 'P2O5', 'SrO', 'SnO2', 'SO2']
# Calculate descriptive statistics for each chemical component
for component in component_cols:
    component_data = data.groupby(['类型', '表面风化'])[component]
    stats = component_data.agg(['mean', 'max', 'min', 'std', 'var', 'skew'])
    # Each group is a Series, so use Series.kurt rather than DataFrame.kurt
    stats['kurt'] = component_data.apply(pd.Series.kurt)
    stats['cv'] = stats['std'] / stats['mean']  # calculate coefficient of variation
    # Add a level to column names
    stats.columns = pd.MultiIndex.from_product([[component], stats.columns])
    grouped_stats = pd.concat([grouped_stats, stats], axis=1)
grouped_stats
类型  表面风化 | SiO2: mean  max  min  std  var  skew  kurt  cv | Na2O: mean  max | ... | SnO2: kurt  cv | SO2: mean  max  min  std  var  skew  kurt  cv
铅钡  无风化 | 3.013743  3.871521  1.859524  0.646195  0.417567  -0.301305  -0.956815  0.214416 | 0.071131  0.876318 | ... | 3.253187  -2.441987 | 0.020569  0.267396  0.000000  0.074162  0.005500  3.605551  13.000000  3.605551
铅钡  风化 | 2.242329  3.937307  -0.131353  0.923780  0.853370  -0.584811  0.650707  0.411973 | 0.013371  1.043858 | ... | 13.632917  -3.664983 | 0.028021  1.369229  -0.796562  0.336451  0.113199  2.108909  9.857280  12.007019
高钾  无风化 | 3.165687  3.712288  2.266609  0.363205  0.131918  -1.093726  3.036563  0.114732 | -0.013585  0.320182 | ... | 12.000000  -3.464102 | -0.507620  0.000000  -2.240723  0.925901  0.857292  -1.388056  -0.011455  -1.824002
高钾  风化 | 4.187045  4.372977  3.830498  0.187388  0.035114  -1.731995  3.641136  0.044754 | 0.000000  0.000000 | ... | 0.000000  NaN | 0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  0.000000  NaN

4 rows × 112 columns
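The same set of statistics can also be produced with a single `agg` call instead of a loop with repeated `pd.concat`. The sketch below is one possible way to do it on made-up numbers: the frame `toy` and the helper functions `kurt` and `cv` are hypothetical, and only one grouping column is used to keep it short.

```python
import pandas as pd

# Hypothetical log-ratio values for two glass types
toy = pd.DataFrame({
    "类型": ["高钾", "高钾", "高钾", "高钾", "铅钡", "铅钡", "铅钡", "铅钡"],
    "SiO2": [3.0, 3.6, 3.1, 2.9, 2.7, 1.2, 1.8, 2.3],
    "PbO":  [0.0, -2.3, -0.7, 0.0, 2.9, 2.4, 2.1, 2.6],
})

def kurt(s: pd.Series) -> float:
    """Sample excess kurtosis of one group."""
    return s.kurt()

def cv(s: pd.Series) -> float:
    """Coefficient of variation: sample std divided by mean (unstable when the mean is near 0)."""
    return s.std() / s.mean()

# One call yields a (component, statistic) column MultiIndex
stats = toy.groupby("类型")[["SiO2", "PbO"]].agg(
    ["mean", "max", "min", "std", "var", "skew", kurt, cv]
)
print(stats["SiO2"].round(3))
```

The `cv` docstring points at something visible in the real output as well: on centered log-ratio data the group means can sit close to zero, which is why values such as -19.285768 or 27.248350 appear in the cv rows and should be read with caution.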

# Adjusting the code to avoid renaming columns, instead we will capture the group information in the DataFrame index
tables_dict = {}
for glass_type in glass_types:
    for weathering_state in weathering_states:
        subset = grouped_stats.loc[glass_type, weathering_state].unstack().T
        table_name = f"{glass_type}_{weathering_state}"
        tables_dict[table_name] = pd.DataFrame(subset)  # explicitly convert to pd.DataFrame
# Looping through the tables_dict and outputting each DataFrame
tables_dict
{'高钾_无风化':
          Al2O3       BaO       CaO       CuO     Fe2O3       K2O       MgO  \
 cv    0.664393 -1.972230  0.893838 -2.321136 -1.626433  0.473990 -0.700958
 kurt -1.409964  3.016385 -0.156702  1.577446  0.472540  1.635379 -1.292382
 max   1.508084  0.019074  1.647769  0.595531  0.747950  2.210662  0.000000
 mean  0.776104 -0.179823  0.599071 -0.262942 -0.390464  1.145963 -0.674968
 min   0.006978 -1.080913 -0.182189 -1.652716 -1.590841  0.000000 -1.332161
 skew -0.031480 -1.906416  0.378894 -1.180633 -0.394538 -0.184857  0.061519
 std   0.515638  0.354653  0.535473  0.610324  0.635064  0.543175  0.473124
 var   0.265882  0.125778  0.286731  0.372495  0.403306  0.295039  0.223846

            Na2O      P2O5       PbO       SO2      SiO2       SnO2       SrO
 cv   -19.285768 -0.979906 -1.116780 -1.824002  0.114732  -3.464102 -1.050200
 kurt   7.015733  0.317255 -1.629147 -0.011455  3.036563  12.000000 -2.376521
 max    0.320182  0.526955  0.000000  0.000000  3.712288   0.000000  0.000000
 mean  -0.013585 -0.938500 -0.987338 -0.507620  3.165687  -0.007795 -1.723790
 min   -0.760277 -2.730275 -2.672140 -2.240723  2.266609  -0.093536 -3.774602
 skew  -2.150622  0.057567 -0.552251 -1.388056 -1.093726  -3.464102 -0.037176
 std    0.262001  0.919641  1.102639  0.925901  0.363205   0.027002  1.810324
 var    0.068645  0.845740  1.215812  0.857292  0.131918   0.000729  3.277274  ,
 '高钾_风化':
          Al2O3  BaO       CaO       CuO     Fe2O3       K2O       MgO  Na2O  \
 cv    2.498627  NaN -0.962261 -8.191497 -0.250545 -0.997049 -1.572791   NaN
 kurt  0.025390  0.0  2.287842  0.619598  1.095297 -0.867476 -1.112631   0.0
 max   0.961580  0.0  0.215634  0.477459 -1.341006  0.000000  0.000000   0.0
 mean  0.194529  0.0 -0.664817 -0.060020 -1.714985 -0.328478 -0.286859   0.0
 min  -0.410081  0.0 -1.760008 -0.889020 -2.470072 -0.824068 -0.983686   0.0
 skew  0.669913  0.0 -0.709483 -1.043688 -1.369695 -0.588570 -1.095736   0.0
 std   0.486056  0.0  0.639727  0.491651  0.429681  0.327508  0.451170   0.0
 var   0.236251  0.0  0.409251  0.241720  0.184626  0.107262  0.203554   0.0

           P2O5  PbO  SO2      SiO2  SnO2  SrO
 cv   -0.562597  NaN  NaN  0.044754   NaN  NaN
 kurt  2.101884  0.0  0.0  3.641136   0.0  0.0
 max   0.000000  0.0  0.0  4.372977   0.0  0.0
 mean -1.326415  0.0  0.0  4.187045   0.0  0.0
 min  -2.178840  0.0  0.0  3.830498   0.0  0.0
 skew  1.134407  0.0  0.0 -1.731995   0.0  0.0
 std   0.746238  0.0  0.0  0.187388   0.0  0.0
 var   0.556871  0.0  0.0  0.035114   0.0  0.0  ,
 '铅钡_无风化':
          Al2O3       BaO       CaO       CuO     Fe2O3       K2O       MgO  \
 cv    3.716292  0.352188 -0.987216 -1.103642 -2.376125 -0.899079 -1.163923
 kurt  0.214284  1.405046 -0.671685 -0.661301  4.165086 -1.951127 -0.717171
 max   0.901223  2.031090  0.340114  0.899535  0.554504  0.000000  0.000000
 mean  0.138882  1.245669 -0.714861 -0.925721 -0.306467 -1.288085 -0.541147
 min  -0.847107  0.260264 -1.990837 -2.580097 -2.264904 -2.915489 -1.822866
 skew -0.716711 -0.562582  0.062455  0.086620 -1.989760  0.104047 -0.750761
 std   0.516125  0.438710  0.705723  1.021664  0.728205  1.158091  0.629853
 var   0.266385  0.192466  0.498044  1.043798  0.530282  1.341175  0.396715

           Na2O      P2O5       PbO        SO2      SiO2      SnO2       SrO
 cv    3.684555 -0.818040  0.266446   3.605551  0.214416 -2.441987 -0.893422
 kurt  8.623783 -1.684970  6.556376  13.000000 -0.956815  3.253187 -2.023534
 max   0.876318  0.000000  2.610837   0.267396  3.871521  0.000000  0.000000
 mean  0.071131 -1.449052  2.160856   0.020569  3.013743 -0.311426 -1.114090
 min  -0.221722 -3.201927  0.468937   0.000000  1.859524 -2.078030 -2.211561
 skew  2.741762 -0.069394 -2.363412   3.605551 -0.301305 -2.182647  0.129023
 std   0.262087  1.185383  0.575751   0.074162  0.646195  0.760497  0.995352
 var   0.068690  1.405133  0.331490   0.005500  0.417567  0.578356  0.990726  ,
 '铅钡_风化':
           Al2O3       BaO       CaO       CuO     Fe2O3       K2O       MgO  \
 cv   -11.231984  0.609170 -1.725044 -1.063677 -1.111626 -1.185000 -0.978333
 kurt  -0.288489 -0.601793 -0.712685 -0.599524 -0.418132 -1.660810 -1.486424
 max    2.042802  2.167893  0.497358  0.888513  0.000000  0.000000  0.000000
 mean  -0.087576  1.035546 -0.375654 -0.824426 -0.723172 -0.967980 -0.693444
 min   -1.826182 -0.181275 -1.877738 -2.764779 -2.575747 -2.970023 -1.841063
 skew   0.155720 -0.126683 -0.661062  0.043889 -0.837263 -0.456970 -0.340389
 std    0.983655  0.630823  0.648019  0.876923  0.803897  1.147056  0.678420
 var    0.967578  0.397938  0.419929  0.768995  0.646251  1.315737  0.460253

            Na2O       P2O5       PbO        SO2      SiO2       SnO2       SrO
 cv    27.248350 -11.063796  0.221513  12.007019  0.411973  -3.664983 -0.415568
 kurt   3.996993   1.120939 -0.753508   9.857280  0.650707  13.632917  1.634188
 max    1.043858   1.188784  3.510396   1.369229  3.937307   0.000000  0.000000
 mean   0.013371  -0.102296  2.402080   0.028021  2.242329  -0.119384 -1.827413
 min   -1.093837  -3.229330  1.389649  -0.796562 -0.131353  -1.944122 -2.930869
 skew  -0.038016  -1.253158  0.239303   2.108909 -0.584811  -3.788951  1.424906
 std    0.364329   1.131785  0.532092   0.336451  0.923780   0.437542  0.759414
 var    0.132736   1.280938  0.283122   0.113199  0.853370   0.191443  0.576710  }
# Optional export of each group's statistics table to its own Excel sheet (left disabled as a string literal)
'''
with pd.ExcelWriter('E:\\数学建模国赛\\2022数学建模赛题\\C题\\一二表单合并数据统计性分析.xlsx') as writer:
    for sheet_name, df in tables_dict.items():
        df.to_excel(writer, sheet_name=sheet_name, index=True)
'''

2.4 Descriptive statistics

tables_dict['高钾_无风化']  # high-potassium, unweathered
           Al2O3        BaO        CaO        CuO      Fe2O3        K2O        MgO       Na2O       P2O5        PbO        SO2       SiO2       SnO2        SrO
cv      0.664393  -1.972230   0.893838  -2.321136  -1.626433   0.473990  -0.700958 -19.285768  -0.979906  -1.116780  -1.824002   0.114732  -3.464102  -1.050200
kurt   -1.409964   3.016385  -0.156702   1.577446   0.472540   1.635379  -1.292382   7.015733   0.317255  -1.629147  -0.011455   3.036563  12.000000  -2.376521
max     1.508084   0.019074   1.647769   0.595531   0.747950   2.210662   0.000000   0.320182   0.526955   0.000000   0.000000   3.712288   0.000000   0.000000
mean    0.776104  -0.179823   0.599071  -0.262942  -0.390464   1.145963  -0.674968  -0.013585  -0.938500  -0.987338  -0.507620   3.165687  -0.007795  -1.723790
min     0.006978  -1.080913  -0.182189  -1.652716  -1.590841   0.000000  -1.332161  -0.760277  -2.730275  -2.672140  -2.240723   2.266609  -0.093536  -3.774602
skew   -0.031480  -1.906416   0.378894  -1.180633  -0.394538  -0.184857   0.061519  -2.150622   0.057567  -0.552251  -1.388056  -1.093726  -3.464102  -0.037176
std     0.515638   0.354653   0.535473   0.610324   0.635064   0.543175   0.473124   0.262001   0.919641   1.102639   0.925901   0.363205   0.027002   1.810324
var     0.265882   0.125778   0.286731   0.372495   0.403306   0.295039   0.223846   0.068645   0.845740   1.215812   0.857292   0.131918   0.000729   3.277274

tables_dict['高钾_风化']  # high-potassium, weathered
           Al2O3        BaO        CaO        CuO      Fe2O3        K2O        MgO       Na2O       P2O5        PbO        SO2       SiO2       SnO2        SrO
cv      2.498627        NaN  -0.962261  -8.191497  -0.250545  -0.997049  -1.572791        NaN  -0.562597        NaN        NaN   0.044754        NaN        NaN
kurt    0.025390        0.0   2.287842   0.619598   1.095297  -0.867476  -1.112631        0.0   2.101884        0.0        0.0   3.641136        0.0        0.0
max     0.961580        0.0   0.215634   0.477459  -1.341006   0.000000   0.000000        0.0   0.000000        0.0        0.0   4.372977        0.0        0.0
mean    0.194529        0.0  -0.664817  -0.060020  -1.714985  -0.328478  -0.286859        0.0  -1.326415        0.0        0.0   4.187045        0.0        0.0
min    -0.410081        0.0  -1.760008  -0.889020  -2.470072  -0.824068  -0.983686        0.0  -2.178840        0.0        0.0   3.830498        0.0        0.0
skew    0.669913        0.0  -0.709483  -1.043688  -1.369695  -0.588570  -1.095736        0.0   1.134407        0.0        0.0  -1.731995        0.0        0.0
std     0.486056        0.0   0.639727   0.491651   0.429681   0.327508   0.451170        0.0   0.746238        0.0        0.0   0.187388        0.0        0.0
var     0.236251        0.0   0.409251   0.241720   0.184626   0.107262   0.203554        0.0   0.556871        0.0        0.0   0.035114        0.0        0.0

tables_dict['铅钡_无风化']  # lead-barium, unweathered
           Al2O3        BaO        CaO        CuO      Fe2O3        K2O        MgO       Na2O       P2O5        PbO        SO2       SiO2       SnO2        SrO
cv      3.716292   0.352188  -0.987216  -1.103642  -2.376125  -0.899079  -1.163923   3.684555  -0.818040   0.266446   3.605551   0.214416  -2.441987  -0.893422
kurt    0.214284   1.405046  -0.671685  -0.661301   4.165086  -1.951127  -0.717171   8.623783  -1.684970   6.556376  13.000000  -0.956815   3.253187  -2.023534
max     0.901223   2.031090   0.340114   0.899535   0.554504   0.000000   0.000000   0.876318   0.000000   2.610837   0.267396   3.871521   0.000000   0.000000
mean    0.138882   1.245669  -0.714861  -0.925721  -0.306467  -1.288085  -0.541147   0.071131  -1.449052   2.160856   0.020569   3.013743  -0.311426  -1.114090
min    -0.847107   0.260264  -1.990837  -2.580097  -2.264904  -2.915489  -1.822866  -0.221722  -3.201927   0.468937   0.000000   1.859524  -2.078030  -2.211561
skew   -0.716711  -0.562582   0.062455   0.086620  -1.989760   0.104047  -0.750761   2.741762  -0.069394  -2.363412   3.605551  -0.301305  -2.182647   0.129023
std     0.516125   0.438710   0.705723   1.021664   0.728205   1.158091   0.629853   0.262087   1.185383   0.575751   0.074162   0.646195   0.760497   0.995352
var     0.266385   0.192466   0.498044   1.043798   0.530282   1.341175   0.396715   0.068690   1.405133   0.331490   0.005500   0.417567   0.578356   0.990726

tables_dict['铅钡_风化']  # lead-barium, weathered
           Al2O3        BaO        CaO        CuO      Fe2O3        K2O        MgO       Na2O       P2O5        PbO        SO2       SiO2       SnO2        SrO
cv    -11.231984   0.609170  -1.725044  -1.063677  -1.111626  -1.185000  -0.978333  27.248350 -11.063796   0.221513  12.007019   0.411973  -3.664983  -0.415568
kurt   -0.288489  -0.601793  -0.712685  -0.599524  -0.418132  -1.660810  -1.486424   3.996993   1.120939  -0.753508   9.857280   0.650707  13.632917   1.634188
max     2.042802   2.167893   0.497358   0.888513   0.000000   0.000000   0.000000   1.043858   1.188784   3.510396   1.369229   3.937307   0.000000   0.000000
mean   -0.087576   1.035546  -0.375654  -0.824426  -0.723172  -0.967980  -0.693444   0.013371  -0.102296   2.402080   0.028021   2.242329  -0.119384  -1.827413
min    -1.826182  -0.181275  -1.877738  -2.764779  -2.575747  -2.970023  -1.841063  -1.093837  -3.229330   1.389649  -0.796562  -0.131353  -1.944122  -2.930869
skew    0.155720  -0.126683  -0.661062   0.043889  -0.837263  -0.456970  -0.340389  -0.038016  -1.253158   0.239303   2.108909  -0.584811  -3.788951   1.424906
std     0.983655   0.630823   0.648019   0.876923   0.803897   1.147056   0.678420   0.364329   1.131785   0.532092   0.336451   0.923780   0.437542   0.759414
var     0.967578   0.397938   0.419929   0.768995   0.646251   1.315737   0.460253   0.132736   1.280938   0.283122   0.113199   0.853370   0.191443   0.576710
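A table with this layout (statistics as rows, components as columns) can be produced along the following lines. This is a minimal sketch rather than the notebook's actual code: the helper name `describe_components` and the toy DataFrame are assumptions made for illustration. It relies on pandas' default (bias-corrected) skewness and kurtosis and defines the coefficient of variation as std / mean, which matches the relationship between the `std`, `mean` and `cv` rows above.

```python
import pandas as pd


def describe_components(df: pd.DataFrame) -> pd.DataFrame:
    """Return one row per statistic and one column per component (hypothetical helper)."""
    stats = pd.DataFrame({
        # Coefficient of variation; becomes NaN when both std and mean are 0
        'cv': df.std() / df.mean(),
        'kurt': df.kurt(),
        'max': df.max(),
        'mean': df.mean(),
        'min': df.min(),
        'skew': df.skew(),
        'std': df.std(),
        'var': df.var(),
    })
    # Transpose so that statistics are rows and components are columns
    return stats.T


# Toy values, made up for illustration (not taken from 附件.xlsx)
toy = pd.DataFrame({
    'SiO2': [3.1, 3.3, 2.9, 3.6, 3.0],
    'K2O': [1.2, 0.8, 1.5, 0.0, 1.1],
    'BaO': [0.0, 0.0, 0.0, 0.0, 0.0],  # an all-zero column, like BaO in 高钾_风化
})
table = describe_components(toy)
print(table)
```

An all-zero column yields `NaN` in the `cv` row because both the standard deviation and the mean are zero, which is the pattern visible in the 高钾_风化 table.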
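'''
Note: the values in these tables appear to be on the centered log-ratio (CLR) scale from section 2.3, which is why many entries are negative. The comparisons below refer to that scale.
Mean:
SiO2 (silicon dioxide): In unweathered glass the SiO2 means of the two types are close (3.17 for high-potassium vs. 3.01 for lead-barium).
After weathering the gap widens (4.19 vs. 2.24), which suggests that weathering shifts SiO2 in opposite directions in the two glass types.
Al2O3 (aluminum oxide): In unweathered glass the Al2O3 mean of high-potassium glass (0.78) is higher than that of lead-barium glass (0.14).
After weathering both means drop (0.19 and -0.09), and high-potassium glass stays higher.
Standard deviation (Std) and coefficient of variation (CV):
Na2O (sodium oxide): In unweathered glass the Na2O mean is slightly higher for lead-barium glass (0.07 vs. -0.01); in weathered high-potassium glass the column is identically 0.
Because all of these means are close to zero, the CV values are extreme (-19.29, 3.68, NaN, 27.25) and carry little information; the standard deviations (0.26, 0.26, 0, 0.36) are the more useful measure of spread here.
CaO (calcium oxide): In unweathered glass the CaO mean is higher for high-potassium glass (0.60 vs. -0.71); after weathering the order reverses (-0.66 vs. -0.38).
Skewness (Skew) and kurtosis (Kurt):
PbO (lead oxide) and BaO (barium oxide): The skewness and kurtosis of these components differ markedly between glass types and weathering states. For PbO in lead-barium glass, skewness goes from -2.36 to 0.24 and kurtosis from 6.56 to -0.75 with weathering, while in weathered high-potassium glass both columns are identically 0.
This may reflect structural differences between the glass types and different effects of weathering.
Observations on specific components:
Silicon dioxide (SiO2): The spread moves in opposite directions: the standard deviation falls from 0.36 to 0.19 in high-potassium glass and rises from 0.65 to 0.92 in lead-barium glass.
This may reflect the effect of weathering on SiO2 content.
Aluminum oxide (Al2O3): Weathering may have a marked effect on Al2O3, especially in lead-barium glass, where the standard deviation nearly doubles (0.52 to 0.98) while it stays almost unchanged in high-potassium glass (0.52 to 0.49).
'''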
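Claims about how the groups differ are easier to check when one statistic is placed side by side for all four groups. The sketch below assumes `tables_dict` has the structure shown in this section (group name mapped to a statistics table). To stay self-contained it builds a small stand-in dictionary, `demo_tables`, from a few of the mean values printed in section 2.4; `demo_tables` and the helper `compare_stat` are hypothetical names introduced for illustration.

```python
import pandas as pd

# Stand-in for tables_dict: only the 'mean' row and three components,
# copied from the tables printed in section 2.4.
demo_tables = {
    '高钾_无风化': pd.DataFrame({'SiO2': [3.165687], 'Al2O3': [0.776104], 'CaO': [0.599071]}, index=['mean']),
    '高钾_风化': pd.DataFrame({'SiO2': [4.187045], 'Al2O3': [0.194529], 'CaO': [-0.664817]}, index=['mean']),
    '铅钡_无风化': pd.DataFrame({'SiO2': [3.013743], 'Al2O3': [0.138882], 'CaO': [-0.714861]}, index=['mean']),
    '铅钡_风化': pd.DataFrame({'SiO2': [2.242329], 'Al2O3': [-0.087576], 'CaO': [-0.375654]}, index=['mean']),
}


def compare_stat(tables: dict, stat: str) -> pd.DataFrame:
    """Collect one statistic from every group: rows are components, columns are groups."""
    return pd.DataFrame({group: table.loc[stat] for group, table in tables.items()})


means = compare_stat(demo_tables, 'mean')
print(means)

# Gap between high-potassium and lead-barium glass, before and after weathering
gap_unweathered = means['高钾_无风化'] - means['铅钡_无风化']
gap_weathered = means['高钾_风化'] - means['铅钡_风化']
print(pd.DataFrame({'unweathered': gap_unweathered, 'weathered': gap_weathered}))
```

With the full `tables_dict`, the same helper could be called with `'std'` or `'skew'` to compare spread or shape instead of the mean.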

2.5 Box plots

import matplotlib.pyplot as plt
import seaborn as sns

# The plot labels are in English, so DejaVu Sans is sufficient
# (use another font if the labels need special characters)
plt.rcParams['font.family'] = 'DejaVu Sans'

# Split the data by glass type and surface weathering
data_high_potassium_erosion = data[(data['类型'] == '高钾') & (data['表面风化'] == '风化')]
data_high_potassium_no_erosion = data[(data['类型'] == '高钾') & (data['表面风化'] == '无风化')]
data_lead_barium_erosion = data[(data['类型'] == '铅钡') & (data['表面风化'] == '风化')]
data_lead_barium_no_erosion = data[(data['类型'] == '铅钡') & (data['表面风化'] == '无风化')]

# Reshape each subset to long format (one row per component value) for the box plots
boxplot_data_high_potassium_erosion = data_high_potassium_erosion.melt(id_vars=['类型', '表面风化'], value_vars=component_cols)
boxplot_data_high_potassium_no_erosion = data_high_potassium_no_erosion.melt(id_vars=['类型', '表面风化'], value_vars=component_cols)
boxplot_data_lead_barium_erosion = data_lead_barium_erosion.melt(id_vars=['类型', '表面风化'], value_vars=component_cols)
boxplot_data_lead_barium_no_erosion = data_lead_barium_no_erosion.melt(id_vars=['类型', '表面风化'], value_vars=component_cols)

# Create a 2x2 grid of subplots (the earlier stray plt.figure(figsize=(20, 45)) call
# only produced an empty figure and is dropped)
fig, axs = plt.subplots(2, 2, figsize=(12, 8))

# Panel order: lead-barium glass first, then high-potassium glass
data_list = [boxplot_data_lead_barium_erosion, boxplot_data_lead_barium_no_erosion, boxplot_data_high_potassium_erosion, boxplot_data_high_potassium_no_erosion]
titles = ['Lead Barium Glass with Erosion', 'Lead Barium Glass without Erosion', 'High Potassium Glass with Erosion', 'High Potassium Glass without Erosion']

# Draw one horizontal box plot per condition; the loop variable is named plot_df
# so that it does not overwrite the global DataFrame `data`
for ax, plot_df, title in zip(axs.flatten(), data_list, titles):
    sns.boxplot(y='variable', x='value', data=plot_df, ax=ax, orient="h")
    ax.set_ylabel('Chemical Component')
    ax.set_xlabel('Content (%)')
    ax.set_title(title)
    ax.invert_yaxis()  # Invert the y-axis labels

# Adjust layout
plt.tight_layout()
plt.show()

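[Figure output_33_1.png: 2x2 grid of box plots of chemical components for lead-barium and high-potassium glass, with and without weathering; the image failed to transfer from the source site]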

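'''
Lead-barium glass:
Changes before and after weathering:
Lower medians: The medians of most chemical components decrease with weathering, in particular Al2O3, K2O, SiO2, CaO, MgO and Na2O.
This may reflect the loss of these components during weathering.
Lower dispersion: The spread of these components also shrinks with weathering, suggesting that weathering may make their contents more uniform.
Observations on specific components:
Aluminum oxide (Al2O3): Weathering lowers the median and tightens the distribution.
Silicon dioxide (SiO2): Weathering lowers the median, and the distribution also becomes tighter.
Potassium oxide (K2O) and sodium oxide (Na2O): The distributions become tighter and the medians decrease.
High-potassium glass:
Changes before and after weathering:
Lower medians: The medians of most components also decrease with weathering, especially K2O and Na2O, as in lead-barium glass.
Change in dispersion: Unlike lead-barium glass, some components have a wider distribution after weathering, for example silicon dioxide (SiO2) and potassium oxide (K2O).
Observations on specific components:
Aluminum oxide (Al2O3): The distribution becomes wider after weathering.
Silicon dioxide (SiO2): Weathering does not appear to change the median noticeably, but the distribution becomes wider.
Potassium oxide (K2O) and sodium oxide (Na2O): The medians drop sharply and the distributions become wider.
Summary:
These box plots show how weathering affects the composition of the glass.
In both lead-barium and high-potassium glass, weathering may cause the loss of certain components, but the size of the effect may depend on the glass type and on the component.
These observations help explain how weathering changes the chemical composition of different glass types, and can in turn inform the conservation and restoration of the artifacts.
'''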
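Readings of medians and spread taken from box plots can be cross-checked numerically. The following is a minimal sketch, assuming a DataFrame with the `类型` and `表面风化` columns used above plus one column per component. The toy values and the helper name `median_iqr` are made up for illustration and are not taken from the competition data.

```python
import pandas as pd


def median_iqr(df: pd.DataFrame, group_cols: list, value_cols: list) -> pd.DataFrame:
    """Median and interquartile range of each component within each group (hypothetical helper)."""
    grouped = df.groupby(group_cols)[value_cols]
    median = grouped.median()
    iqr = grouped.quantile(0.75) - grouped.quantile(0.25)
    # Put the two statistics side by side under a two-level column header
    return pd.concat({'median': median, 'iqr': iqr}, axis=1)


# Toy values, made up for illustration
toy = pd.DataFrame({
    '类型': ['高钾'] * 4 + ['铅钡'] * 4,
    '表面风化': ['无风化', '无风化', '风化', '风化'] * 2,
    'SiO2': [65.0, 69.0, 92.0, 94.0, 50.0, 58.0, 25.0, 35.0],
    'K2O': [9.0, 11.0, 0.5, 1.5, 0.2, 0.4, 0.1, 0.3],
})
summary = median_iqr(toy, ['类型', '表面风化'], ['SiO2', 'K2O'])
print(summary)
```

The median corresponds to the line inside each box and the interquartile range to the box length, so a statement such as "the distribution becomes tighter" can be compared against a smaller `iqr` value for the weathered group.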
