
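This figure uses SHAP to analyze the importance of each feature in the model, with explanations at both the global level (panels a and b) and the local level (panels g, h, and i). It lets readers see clearly how each feature affects the prediction. Here we mainly reproduce panels a and b, which are built on the same principle.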

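Goals and Methods

Following a similar approach, this article uses a Parkinson's disease dataset, a random forest model, and the SHAP library to produce feature-contribution charts. It shows how to use SHAP to explain model predictions and should help you reproduce this kind of scientific visualization.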

Code Implementation

Loading the Data

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split

plt.rcParams['font.family'] = 'Times New Roman'
plt.rcParams['axes.unicode_minus'] = False

df = pd.read_excel('Parkinsons Telemonitoring.xlsx')

# Separate the features from the target variable
X = df.drop(['total_UPDRS', 'motor_UPDRS'], axis=1)
y = df['total_UPDRS']

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=42)

df.head()

This code loads the data, extracts the features and the target variable total_UPDRS, and splits the dataset into 80% training and 20% test data for later model training and evaluation. The dataset contains recordings from 42 Parkinson's disease patients and monitors disease status through 16 voice measures, including jitter, shimmer, and the noise-to-harmonics ratio (NHR). The target is the Unified Parkinson's Disease Rating Scale (UPDRS), reported as a total score (total_UPDRS) and a motor score (motor_UPDRS). The dataset is mainly used to predict a patient's UPDRS score and so help assess disease severity. To obtain the dataset for reproduction, you can contact the author via WeChat.
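To make the 80/20 split concrete, here is a minimal NumPy sketch of what a shuffled hold-out split does. The function name `holdout_split` and the row count of 1000 are assumptions made for illustration. This is not scikit-learn's implementation, so the rows it selects will differ from those chosen by `train_test_split`, even with the same seed.

```python
import math
import numpy as np

def holdout_split(n_samples, test_size=0.2, seed=42):
    """Return (train_idx, test_idx) for a shuffled hold-out split (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)          # shuffle the row positions
    n_test = math.ceil(test_size * n_samples)   # round the test share up
    return order[n_test:], order[:n_test]

# 1000 rows is an arbitrary illustrative size, not the size of the real dataset
train_idx, test_idx = holdout_split(1000)
print(len(train_idx), len(test_idx))  # 800 200
```

Every row lands in exactly one of the two index arrays, which is the property that keeps the test metrics independent of the rows used for fitting.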

Model Building and Evaluation

RF Model Training

from sklearn.ensemble import RandomForestRegressor

# Create a random forest regressor and set its parameters
rf_regressor = RandomForestRegressor(
    n_estimators=100,              # number of trees in the forest (default 100); adjust as needed
    criterion='squared_error',     # split-quality measure; 'squared_error' (default) is mean squared error, 'absolute_error' is another option
    max_depth=7,                   # maximum depth of each tree; None means no limit
    min_samples_split=2,           # minimum number of samples required to split a node (default 2)
    min_samples_leaf=1,            # minimum number of samples required at a leaf (default 1)
    min_weight_fraction_leaf=0.0,  # like min_samples_leaf, but as a fraction of the total sample weight (default 0.0)
    random_state=42,               # seeds the random number generator so results are reproducible
    max_leaf_nodes=None,           # maximum number of leaf nodes per tree; None means no limit
    min_impurity_decrease=0.0      # minimum impurity decrease required to split a node (default 0.0)
)

# Train the model
rf_regressor.fit(X_train, y_train)

This code creates and trains a random forest regression model with the specified parameters to predict total_UPDRS.

Visualizing Model Evaluation Metrics

from sklearn import metrics

# Predict on the training and test sets
y_pred_train = rf_regressor.predict(X_train)
y_pred_test = rf_regressor.predict(X_test)

y_pred_train_list = y_pred_train.tolist()
y_pred_test_list = y_pred_test.tolist()

# Metrics on the training set
mse_train = metrics.mean_squared_error(y_train, y_pred_train_list)
rmse_train = np.sqrt(mse_train)
mae_train = metrics.mean_absolute_error(y_train, y_pred_train_list)
r2_train = metrics.r2_score(y_train, y_pred_train_list)

# Metrics on the test set
mse_test = metrics.mean_squared_error(y_test, y_pred_test_list)
rmse_test = np.sqrt(mse_test)
mae_test = metrics.mean_absolute_error(y_test, y_pred_test_list)
r2_test = metrics.r2_score(y_test, y_pred_test_list)

# Collect the metrics in lists
metrics_labels = ['MSE', 'RMSE', 'MAE', 'R-squared']
train_metrics = [mse_train, rmse_train, mae_train, r2_train]
test_metrics = [mse_test, rmse_test, mae_test, r2_test]

# Create the grouped bar chart
x = np.arange(len(metrics_labels))  # x-axis positions
width = 0.35  # bar width

fig, ax = plt.subplots()

# Bars for the training and test sets
bars1 = ax.bar(x - width/2, train_metrics, width, label='Train')
bars2 = ax.bar(x + width/2, test_metrics, width, label='Test')

# Add the axis label, title, ticks, and legend
ax.set_ylabel('Scores')
ax.set_title('Comparison of Train and Test Set Metrics')
ax.set_xticks(x)
ax.set_xticklabels(metrics_labels)
ax.legend()

# Show the value above each bar
def autolabel(bars):
    """Annotate each bar with its height."""
    for bar in bars:
        height = bar.get_height()
        ax.annotate('{}'.format(round(height, 3)),
                    xy=(bar.get_x() + bar.get_width() / 2, height),
                    xytext=(0, 3),  # 3-point vertical offset
                    textcoords="offset points",
                    ha='center', va='bottom')

autolabel(bars1)
autolabel(bars2)

fig.tight_layout()
plt.savefig("Comparison of Train and Test Set Metrics.pdf", format='pdf', bbox_inches='tight')
plt.show()

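This code computes error and goodness-of-fit metrics (MSE, RMSE, MAE, R²) on the training and test sets and compares them in a bar chart, which helps judge whether the model is overfitting or underfitting. The metrics show that the model performs similarly on both sets, so there is no serious overfitting or underfitting. The test error is slightly higher than the training error, but the gap is small, which suggests the model generalizes reasonably well.

Visualizing Model Predictions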
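As a rough sketch of what the four metrics measure and how a train/test gap can be read off, the snippet below recomputes them with NumPy on invented toy values. The helper names `regression_metrics` and `generalization_gap` are hypothetical, and the numbers are not results from the Parkinson's dataset.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Compute MSE, RMSE, MAE and R2 from their textbook definitions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    mse = float(np.mean(residuals ** 2))
    ss_res = float(np.sum(residuals ** 2))                   # residual sum of squares
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))    # total sum of squares
    return {
        "MSE": mse,
        "RMSE": mse ** 0.5,
        "MAE": float(np.mean(np.abs(residuals))),
        "R2": 1.0 - ss_res / ss_tot,
    }

def generalization_gap(train, test):
    """Test metric minus train metric, for each metric name."""
    return {name: test[name] - train[name] for name in train}

# Invented toy predictions standing in for the training and test sets
toy_train = regression_metrics([1, 2, 3, 4], [1.1, 1.9, 3.2, 3.8])
toy_test = regression_metrics([1, 2, 3, 4], [1.5, 1.5, 3.5, 3.5])
print(generalization_gap(toy_train, toy_test))
```

A large positive gap in the error metrics together with a large negative gap in R2 would point toward overfitting; small gaps, as the article reports for its model, point toward similar behavior on both sets.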

import seaborn as sns

# Build DataFrames holding the true and predicted values for the training and test sets
data_train = pd.DataFrame({
    'True': y_train,
    'Predicted': y_pred_train,
    'Data Set': 'Train'
})

data_test = pd.DataFrame({
    'True': y_test,
    'Predicted': y_pred_test,
    'Data Set': 'Test'
})

data = pd.concat([data_train, data_test])

# Custom color palette
palette = {'Train': '#b4d4e1', 'Test': '#f4ba8a'}

# Create the JointGrid; it builds its own figure, so no separate plt.figure() call is needed
g = sns.JointGrid(data=data, x="True", y="Predicted", hue="Data Set", height=10, palette=palette)

# Draw the central scatter plot
g.plot_joint(sns.scatterplot, alpha=0.5)
# Add the regression line for the training set
sns.regplot(data=data_train, x="True", y="Predicted", scatter=False, ax=g.ax_joint, color='#b4d4e1', label='Train Regression Line')
# Add the regression line for the test set
sns.regplot(data=data_test, x="True", y="Predicted", scatter=False, ax=g.ax_joint, color='#f4ba8a', label='Test Regression Line')
# Add the marginal histograms
g.plot_marginals(sns.histplot, kde=False, element='bars', multiple='stack', alpha=0.5)

# Add the R² values in the lower-right corner
ax = g.ax_joint
ax.text(0.95, 0.1, f'Train $R^2$ = {r2_train:.3f}', transform=ax.transAxes, fontsize=12,
        verticalalignment='bottom', horizontalalignment='right', bbox=dict(boxstyle="round,pad=0.3", edgecolor="black", facecolor="white"))
ax.text(0.95, 0.05, f'Test $R^2$ = {r2_test:.3f}', transform=ax.transAxes, fontsize=12,
        verticalalignment='bottom', horizontalalignment='right', bbox=dict(boxstyle="round,pad=0.3", edgecolor="black", facecolor="white"))
# Add the model name near the top-right corner
ax.text(0.75, 0.99, 'Model = RF', transform=ax.transAxes, fontsize=12,
        verticalalignment='top', horizontalalignment='left', bbox=dict(boxstyle="round,pad=0.3", edgecolor="black", facecolor="white"))

# Add the x = y reference line
ax.plot([data['True'].min(), data['True'].max()], [data['True'].min(), data['True'].max()], c="black", alpha=0.5, linestyle='--', label='x=y')
ax.legend()
plt.savefig("TrueFalse.pdf", format='pdf', bbox_inches='tight')
plt.show()

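This code uses a scatter plot, regression lines, histograms, and R² values to show how the model predicts on the training and test sets. The diagonal x = y marks perfect prediction; how far the points scatter from it, and how closely the regression lines follow it, indicate the model's actual predictive ability. Together these plots give a good picture of the model's accuracy and generalization. This part of the code follows the article "Let the Charts Speak: How to Present Regression Model Results Effectively".

Visualizing Raw SHAP Feature Contributions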
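One way to put a number on how far a regression line sits from the x = y diagonal is to fit the line yourself and inspect its slope and intercept. The sketch below does this with `np.polyfit` on invented values; the helper name `calibration_line` is hypothetical and the data are not from the article.

```python
import numpy as np

def calibration_line(y_true, y_pred):
    """Least-squares slope and intercept of predictions regressed on true values."""
    x = np.asarray(y_true, dtype=float)
    y = np.asarray(y_pred, dtype=float)
    slope, intercept = np.polyfit(x, y, deg=1)  # coefficients come back highest degree first
    return float(slope), float(intercept)

# Invented example: predictions shrunk halfway toward the mean of 25
toy_true = np.array([10.0, 20.0, 30.0, 40.0])
toy_pred = 25.0 + 0.5 * (toy_true - 25.0)
slope, intercept = calibration_line(toy_true, toy_pred)
print(round(slope, 6), round(intercept, 6))  # 0.5 12.5
```

A slope of 1 with an intercept of 0 reproduces the diagonal. A slope below 1, as in this toy case, means low true values are over-predicted and high ones under-predicted.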

import shap

# Build the SHAP explainer
explainer = shap.TreeExplainer(rf_regressor)
# Compute SHAP values for the test set
shap_values = explainer.shap_values(X_test)
# Feature labels
labels = X_test.columns
# Draw the SHAP summary plot as a bar chart
plt.figure(figsize=(15, 5))
shap.summary_plot(shap_values, X_test, plot_type="bar", show=False)
plt.title("SHAP_Feature_Importance_Raw_Output")
plt.savefig("SHAP_Feature_Importance_Raw_Output.pdf", format='pdf', bbox_inches='tight')
plt.show()

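This code uses SHAP's built-in functions to compute the SHAP value of every feature on the test set and to draw a bar chart summarizing each feature's importance to the model's predictions. The chart is produced by shap.summary_plot() and shows at a glance which features matter most, giving a global view of model interpretability. No manual plotting is needed to get this result. However, SHAP cannot directly produce a feature-contribution figure like the one in the literature, so we have to build that chart ourselves from the underlying principle.

Data Preparation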
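For intuition about what a SHAP value is, the sketch below computes exact Shapley values by brute force for an invented two-feature model. It is only an illustration: `shap.TreeExplainer` uses a tree-specific algorithm and a background expectation rather than this enumeration, and `toy_model`, `coalition_value`, the instance, and the all-zero baseline are assumptions introduced here.

```python
from itertools import combinations
from math import factorial

def exact_shapley(value_fn, n_features):
    """Exact Shapley values for a set function value_fn(frozenset) -> float."""
    phi = [0.0] * n_features
    for i in range(n_features):
        others = [j for j in range(n_features) if j != i]
        for size in range(len(others) + 1):
            # Weight of a coalition of this size in the Shapley average
            weight = factorial(size) * factorial(n_features - size - 1) / factorial(n_features)
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature i when added to coalition s
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi

def toy_model(x0, x1):
    """Invented model with an interaction term."""
    return 2 * x0 + 3 * x1 + x0 * x1

instance, baseline = (1.0, 2.0), (0.0, 0.0)

def coalition_value(subset):
    """Model output when only the features in `subset` take the instance's values."""
    mixed = [instance[j] if j in subset else baseline[j] for j in range(2)]
    return toy_model(*mixed)

phi = exact_shapley(coalition_value, n_features=2)
print(phi)  # [3.0, 7.0]
print(sum(phi) == toy_model(*instance) - toy_model(*baseline))  # True
```

The second print shows the additivity property: the per-feature values sum to the difference between the prediction for the instance and the prediction for the baseline. The interaction term `x0 * x1` is shared equally between the two features.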

# Compute each feature's contribution as its mean absolute SHAP value
feature_contributions = np.abs(shap_values).mean(axis=0)

# Build a DataFrame with one column of feature names and one of contributions
contribution_df = pd.DataFrame({
    'Feature': labels,
    'Contribution': feature_contributions
})

# Category names
categories = ['Basic Info', 'Jitter', 'Shimmer', 'Noise', 'Non-linear']

# Category of each feature
category_map = {
    'age': 'Basic Info',
    'sex': 'Basic Info',
    'test_time': 'Basic Info',
    'Jitter(%)': 'Jitter',
    'Jitter(Abs)': 'Jitter',
    'Jitter:RAP': 'Jitter',
    'Jitter:PPQ5': 'Jitter',
    'Jitter:DDP': 'Jitter',
    'Shimmer': 'Shimmer',
    'Shimmer(dB)': 'Shimmer',
    'Shimmer:APQ3': 'Shimmer',
    'Shimmer:APQ5': 'Shimmer',
    'Shimmer:APQ11': 'Shimmer',
    'Shimmer:DDA': 'Shimmer',
    'NHR': 'Noise',
    'HNR': 'Noise',
    'RPDE': 'Non-linear',
    'DFA': 'Non-linear',
    'PPE': 'Non-linear'
}

# Sort by contribution, highest first, to create contribution_df_sorted
contribution_df_sorted = contribution_df.sort_values(by='Contribution', ascending=False)

# Map each feature to its category
contribution_df_sorted['Category'] = contribution_df_sorted['Feature'].map(category_map)
contribution_df_sorted

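This code computes each feature's average contribution to the model's predictions and assigns every feature to a category. Feature importance is computed as the mean of the absolute SHAP values across samples. The raw data has no category labels, so we group the features by what they measure into five categories: basic information (age, sex, and so on), jitter features (Jitter), shimmer features (Shimmer), noise features (Noise), and non-linear features (Non-linear). This grouping prepares the data for reproducing the literature figure.

Reproducing the Literature Figure

Drawing the Ring Chart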
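The aggregation can be checked by hand on a small example. The sketch below applies the same mean-absolute-value rule to an invented 3-sample, 4-feature SHAP matrix and then sums the result by category. All values, and the names `toy_shap`, `toy_features`, and `toy_category`, are made up for illustration.

```python
import numpy as np

# Invented SHAP matrix: rows are samples, columns are features
toy_shap = np.array([
    [ 2.0, -1.0,  0.5,  0.0],
    [-4.0,  1.0, -0.5,  1.0],
    [ 3.0, -1.0,  0.5, -2.0],
])
toy_features = ["age", "DFA", "NHR", "HNR"]
toy_category = {"age": "Basic Info", "DFA": "Non-linear", "NHR": "Noise", "HNR": "Noise"}

# Mean absolute SHAP value per feature (same rule as feature_contributions)
importance = np.abs(toy_shap).mean(axis=0)
print(importance)  # [3.  1.  0.5 1. ]

# Sum the feature importances within each category
category_total = {}
for name, value in zip(toy_features, importance):
    cat = toy_category[name]
    category_total[cat] = category_total.get(cat, 0.0) + float(value)

# Convert the category totals to percentage shares
total = sum(category_total.values())
category_share = {cat: 100.0 * value / total for cat, value in category_total.items()}
print(category_share)
```

Taking the absolute value first matters: the signed mean of the first column is only 1/3, because positive and negative effects cancel, while its mean absolute value is 3.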

# Sort by category and then by contribution, so features of the same category sit together, highest contribution first
contribution_df_sorted = contribution_df_sorted.sort_values(by=['Category', 'Contribution'], ascending=[True, False])
# Helper that generates a color gradient
def get_color_gradient(base_color, num_shades):
    # Vary the alpha channel from lighter (0.4) to the full base color (1)
    gradient = np.linspace(0.4, 1, num_shades)
    return [(base_color[0], base_color[1], base_color[2], shade) for shade in gradient]

# Colors for the five categories
category_colors = {
    'Basic Info': (0.9, 0.7, 0.2, 1),  # yellow
    'Jitter': (0.6, 0.3, 0.9, 1),     # purple
    'Noise': (0.7, 0.3, 0.3, 1),      # dark red
    'Non-linear': (0.2, 0.9, 0.9, 1), # cyan
    'Shimmer': (0.3, 0.6, 0.9, 1),    # light blue
}

# Default color, used when a category has no color defined
default_color = (0.8, 0.8, 0.8, 1)  # gray

# Category-level and feature-level contributions
# (despite its name, inner_contribution is drawn as the outer ring below)
inner_contribution = contribution_df_sorted.groupby('Category')['Contribution'].sum()
outer_contribution = contribution_df_sorted.set_index('Feature')['Contribution']

# Check for categories without a defined color
undefined_categories = set(inner_contribution.index) - set(category_colors.keys())
if undefined_categories:
    print(f"Warning: these categories have no color defined and will use the default color: {undefined_categories}")

# Build a color gradient for the features of each category
outer_colors = []
for category in inner_contribution.index:
    # Select the rows of the current category
    category_df = contribution_df_sorted[contribution_df_sorted['Category'] == category]
    # Base color of the category, or the default color if none is defined
    base_color = category_colors.get(category, default_color)
    # Gradient for the current category
    gradient_colors = get_color_gradient(base_color, len(category_df))
    outer_colors.extend(gradient_colors)

# Labels for the two rings
inner_labels = inner_contribution.index
outer_labels = outer_contribution.index

# Draw the concentric pie charts
fig, ax = plt.subplots(figsize=(8, 8), dpi=1200)

# Category-level ring (radius=1, the outer ring): show percentages, hide labels
ax.pie(inner_contribution, labels=['']*len(inner_contribution), autopct='%1.1f%%', radius=1,
       colors=[category_colors.get(cat, default_color) for cat in inner_labels], wedgeprops=dict(width=0.3, edgecolor='w'))

# Feature-level ring (radius=0.7, the inner ring): hide labels and percentages
ax.pie(outer_contribution, labels=['']*len(outer_labels), radius=0.7, colors=outer_colors, wedgeprops=dict(width=0.3, edgecolor='w'))

# Add a white circle in the center to form the ring chart
plt.gca().add_artist(plt.Circle((0, 0), 0.4, color='white'))

# Add the title
plt.title('Feature and Category Contribution by SHAP')
plt.savefig("Feature and Category Contribution by SHAP.pdf", format='pdf', bbox_inches='tight')
# Show the chart
plt.show()

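The code first sorts the data by category and contribution, then draws concentric pie charts: the outer ring shows each category's total contribution and the inner ring shows each feature's contribution, with a color gradient distinguishing the features within a category. The result is a ring chart of feature contributions saved as a PDF. Labels are turned off here to keep the step-by-step demonstration simple; readers can turn them on to make the chart easier to read.

Drawing the Bar Chart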
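Sorting by category before drawing matters because both pies start at the same angle and give each wedge a sweep proportional to its value. The sketch below computes those sweeps in plain Python to show that, when features are ordered by category, the category boundaries of the two rings coincide. The helper `wedge_angles` and the toy values are assumptions; the start angle of 0 degrees matches matplotlib's default `startangle`.

```python
def wedge_angles(values, start_deg=0.0):
    """Return (start, end) angles in degrees for pie wedges proportional to `values`."""
    total = sum(values)
    angles, cursor = [], start_deg
    for value in values:
        sweep = 360.0 * value / total
        angles.append((cursor, cursor + sweep))
        cursor += sweep
    return angles

# Invented feature contributions, already sorted by category
toy_rows = [
    ("Basic Info", 3.0), ("Basic Info", 1.0),
    ("Noise", 1.0), ("Noise", 0.5),
    ("Non-linear", 2.5),
]

# Category totals, in order of first appearance
toy_totals = {}
for category, value in toy_rows:
    toy_totals[category] = toy_totals.get(category, 0.0) + value

category_ring = wedge_angles(list(toy_totals.values()))
feature_ring = wedge_angles([value for _, value in toy_rows])

print(category_ring)  # [(0.0, 180.0), (180.0, 247.5), (247.5, 360.0)]
print(feature_ring)   # [(0.0, 135.0), (135.0, 180.0), (180.0, 225.0), (225.0, 247.5), (247.5, 360.0)]
```

Every end angle in the category ring also appears as an end angle in the feature ring, so each group of feature wedges lines up with its category wedge. If the features were not grouped by category, the two rings would no longer correspond.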

# Sort by contribution, highest first
contribution_df_sorted = contribution_df_sorted.sort_values(by='Contribution', ascending=False)

# Prepare the list of bar colors
bar_colors = [category_colors.get(cat, (0.8, 0.8, 0.8, 1)) for cat in contribution_df_sorted['Category']]

# Create the figure for the horizontal bar chart
fig, ax = plt.subplots(figsize=(10, 8), dpi=1200)

# Draw the bars
ax.barh(contribution_df_sorted['Feature'], contribution_df_sorted['Contribution'], color=bar_colors)

# Add the legend
handles = [plt.Rectangle((0, 0), 1, 1, color=category_colors[cat]) for cat in category_colors]
labels = list(category_colors.keys())
ax.legend(handles, labels, loc='lower right')

# Set the axis labels and title
ax.set_xlabel('Contribution')
ax.set_ylabel('Feature')
ax.set_title('Feature Contributions by Category')

# Invert the y-axis so the feature with the largest contribution is at the top
ax.invert_yaxis()
plt.savefig("Feature Contributions by Category.pdf", format='pdf', bbox_inches='tight')
# Show the chart
plt.show()

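This produces a horizontal bar chart that lists the features from highest to lowest contribution, colors each bar by the feature's category, and adds a legend. The chart shows directly which features influence the model most.

Combining the Ring Chart and the Bar Chart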
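A ranked bar chart is often read together with a running total: how much of the overall contribution the top few features cover. The sketch below computes that cumulative share in plain Python. The helper `cumulative_share` and the toy contributions are invented for illustration and are not values from the model above.

```python
def cumulative_share(contributions):
    """Rank features by contribution and attach the running share of the total, in percent."""
    ranked = sorted(contributions.items(), key=lambda item: item[1], reverse=True)
    total = sum(value for _, value in ranked)
    rows, running = [], 0.0
    for name, value in ranked:
        running += value
        rows.append((name, value, 100.0 * running / total))
    return rows

# Invented contributions
toy_contributions = {"DFA": 2.0, "age": 4.0, "NHR": 0.5, "HNR": 1.5}
for name, value, share in cumulative_share(toy_contributions):
    print(f"{name:>4}  {value:4.1f}  {share:6.2f}%")
```

In this toy case the top two features already cover 75% of the total, which is the kind of statement the sorted bar chart supports visually.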

from mpl_toolkits.axes_grid1.inset_locator import inset_axes

# Sort by category and then by contribution, so features of the same category sit together, highest contribution first
contribution_df_sorted = contribution_df_sorted.sort_values(by=['Category', 'Contribution'], ascending=[True, False])

# Helper that generates a color gradient
def get_color_gradient(base_color, num_shades):
    # Vary the alpha channel from lighter (0.4) to the full base color (1)
    gradient = np.linspace(0.4, 1, num_shades)
    return [(base_color[0], base_color[1], base_color[2], shade) for shade in gradient]

# Colors for the five categories
category_colors = {
    'Basic Info': (0.9, 0.7, 0.2, 1),  # yellow
    'Jitter': (0.6, 0.3, 0.9, 1),     # purple
    'Noise': (0.7, 0.3, 0.3, 1),      # dark red
    'Non-linear': (0.2, 0.9, 0.9, 1), # cyan
    'Shimmer': (0.3, 0.6, 0.9, 1),    # light blue
}

# Default color, used when a category has no color defined
default_color = (0.8, 0.8, 0.8, 1)  # gray

# Category-level and feature-level contributions
# (despite its name, inner_contribution is drawn as the outer ring below)
inner_contribution = contribution_df_sorted.groupby('Category')['Contribution'].sum()
outer_contribution = contribution_df_sorted.set_index('Feature')['Contribution']

# Check for categories without a defined color
undefined_categories = set(inner_contribution.index) - set(category_colors.keys())
if undefined_categories:
    print(f"Warning: these categories have no color defined and will use the default color: {undefined_categories}")

# Build a color gradient for the features of each category
outer_colors = []
for category in inner_contribution.index:
    # Select the rows of the current category
    category_df = contribution_df_sorted[contribution_df_sorted['Category'] == category]
    # Base color of the category, or the default color if none is defined
    base_color = category_colors.get(category, default_color)
    # Gradient for the current category
    gradient_colors = get_color_gradient(base_color, len(category_df))
    outer_colors.extend(gradient_colors)

# Labels for the two rings
inner_labels = inner_contribution.index
outer_labels = outer_contribution.index

# Create the figure and axes
fig, ax = plt.subplots(figsize=(10, 8), dpi=1200)
# Use a light gray background
ax.set_facecolor('#f0f0f0')
# Add grid lines and set their style
ax.grid(True, which='both', linestyle='--', linewidth=0.7, color='gray', alpha=0.7)
# ---- Draw the bar chart ----
# Sort by contribution, highest first
contribution_df_sorted = contribution_df_sorted.sort_values(by='Contribution', ascending=False)

# Prepare the list of bar colors
bar_colors = [category_colors.get(cat, (0.8, 0.8, 0.8, 1)) for cat in contribution_df_sorted['Category']]

# Draw the bars
ax.barh(contribution_df_sorted['Feature'], contribution_df_sorted['Contribution'], color=bar_colors)

# Add the legend
handles = [plt.Rectangle((0, 0), 1, 1, color=category_colors[cat]) for cat in category_colors]
labels = list(category_colors.keys())
ax.legend(handles, labels, loc='lower right')

# Set the axis labels and title
ax.set_xlabel('Contribution')
ax.set_ylabel('Feature')
ax.set_title('Feature Contributions by Category')

# Invert the y-axis so the feature with the largest contribution is at the top
ax.invert_yaxis()

# ---- Embed the concentric pie chart in the bar chart ----
# Create the inset axes
inset_ax = inset_axes(ax, width=2, height=2, loc='upper right', bbox_to_anchor=(0.8, 0.35, 0.2, 0.2), bbox_transform=ax.transAxes)

# Category-level ring (radius=1, the outer ring): show percentages, hide labels
inset_ax.pie(inner_contribution, labels=['']*len(inner_contribution), autopct='%1.1f%%', radius=1,
       colors=[category_colors.get(cat, default_color) for cat in inner_labels], wedgeprops=dict(width=0.3, edgecolor='w'))

# Feature-level ring (radius=0.7, the inner ring): hide labels and percentages
inset_ax.pie(outer_contribution, labels=['']*len(outer_labels), radius=0.7, colors=outer_colors, wedgeprops=dict(width=0.3, edgecolor='w'))

# Add a white circle in the center to form the ring chart
inset_ax.add_artist(plt.Circle((0, 0), 0.4, color='white'))

plt.savefig("Combined_Feature_Contributions_and_Circular_Chart.pdf", format='pdf', bbox_inches='tight')
# Show the chart
plt.show()

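Combining the bar chart with the concentric pie chart shows both each feature's contribution and each category's overall contribution. The bar chart gives the feature-level detail and the pie chart gives the category-level summary, so readers can understand feature importance both globally and in detail. Readers do not actually need to draw the ring chart and the bar chart separately; the author did so only to make the steps easier to follow. Note also that this ring chart is the reverse of the one in the literature: here the outer ring is the total contribution per category and the inner ring maps the contribution of each feature. This makes no difference to how the model is interpreted.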
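If you would rather match the ring order used in the literature, the only change needed is which data set receives the larger `radius`. The sketch below is one hypothetical way to organize that choice: `ring_kwargs` is an invented helper that returns keyword arguments for the two `pie` calls, using the radius and width values from the code above as defaults.

```python
def ring_kwargs(categories_outside=True, outer_radius=1.0, ring_width=0.3):
    """Return pie() keyword arguments for the category ring and the feature ring."""
    inner_radius = outer_radius - ring_width
    if categories_outside:
        # Layout used in this article: categories outside, features inside
        category_radius, feature_radius = outer_radius, inner_radius
    else:
        # Swapped layout: features outside, categories inside
        category_radius, feature_radius = inner_radius, outer_radius
    wedge = {"width": ring_width, "edgecolor": "w"}
    return {
        "category": {"radius": category_radius, "wedgeprops": dict(wedge)},
        "feature": {"radius": feature_radius, "wedgeprops": dict(wedge)},
    }

swapped = ring_kwargs(categories_outside=False)
print(swapped["category"]["radius"], swapped["feature"]["radius"])
```

With this helper, the two calls would become, for example, `inset_ax.pie(inner_contribution, colors=..., **swapped["category"])` and `inset_ax.pie(outer_contribution, colors=outer_colors, **swapped["feature"])`. The percentage labels set through `autopct` may need repositioning when the category ring moves inward.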

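Interpreting the figure: age, in the "Basic Info" category, has the highest contribution, and DFA, a "Non-linear" feature, also contributes substantially. The outer ring shows that the "Basic Info" category accounts for 71.5% of the total contribution and the "Non-linear" category for 20%, while the remaining categories ("Noise", "Jitter", and "Shimmer") contribute little. The inner ring breaks each category down into its individual features, which helps in seeing how important each feature is to the prediction. Note that this result comes from demonstration data used for the reproduction and does not necessarily carry over to real-world settings. Readers who have trouble understanding the code can contact the author or work through it with the help of ChatGPT.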

This article is reposted from the WeChat official account @Python機器學習AI
