Python sklearn roc_curve: Usage and Code Examples

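The ROC curve (Receiver Operating Characteristic curve) is a tool for evaluating the performance of a classification model. It plots the true positive rate (TPR) against the false positive rate (FPR) as the decision threshold varies, which shows how the model trades off detected positives against false alarms.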
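To make the two rates concrete, here is a minimal sketch that computes TPR and FPR by hand at a few thresholds. The helper name `tpr_fpr_at_threshold` and the tiny label and score lists are illustrative assumptions, and the sketch assumes a sample is predicted positive when its score is greater than or equal to the threshold.

```python
import numpy as np

def tpr_fpr_at_threshold(y_true, y_scores, threshold):
    """Hypothetical helper: TPR and FPR when scores >= threshold count as positive."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_scores) >= threshold

    tp = np.sum(y_pred & (y_true == 1))   # positives correctly flagged
    fn = np.sum(~y_pred & (y_true == 1))  # positives missed
    fp = np.sum(y_pred & (y_true == 0))   # negatives wrongly flagged
    tn = np.sum(~y_pred & (y_true == 0))  # negatives correctly rejected

    tpr = float(tp / (tp + fn))  # TPR = TP / (TP + FN)
    fpr = float(fp / (fp + tn))  # FPR = FP / (FP + TN)
    return tpr, fpr

# Made-up labels and scores, for illustration only
y_true = [0, 0, 1, 1]
y_scores = [0.1, 0.4, 0.35, 0.8]

for t in (0.8, 0.4, 0.35):
    tpr, fpr = tpr_fpr_at_threshold(y_true, y_scores, t)
    print(f"threshold={t}: TPR={tpr}, FPR={fpr}")
```

Each threshold yields one (FPR, TPR) point; sweeping the threshold across all score values traces out the ROC curve.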

In Python, the sklearn library provides the roc_curve function (in sklearn.metrics) for computing the points of the ROC curve.

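Below is the basic usage of roc_curve, followed by a complete code example.

roc_curve usage:
```python
from sklearn.metrics import roc_curve

# y_true holds the ground-truth labels; y_scores holds the model's predicted probability scores
y_true = [0, 0, 1, 1]
y_scores = [0.1, 0.4, 0.35, 0.8]
fpr, tpr, thresholds = roc_curve(y_true, y_scores)
```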
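It can help to inspect what roc_curve returns for this small input. The sketch below simply prints the three arrays side by side; the loop variable names are illustrative. Note that the first threshold is a sentinel value that corresponds to predicting no sample as positive, and its exact value depends on the scikit-learn version, so the sketch does not rely on it.

```python
from sklearn.metrics import roc_curve

y_true = [0, 0, 1, 1]
y_scores = [0.1, 0.4, 0.35, 0.8]
fpr, tpr, thresholds = roc_curve(y_true, y_scores)

# The three arrays have the same length: one (FPR, TPR) point per threshold.
# Thresholds are in decreasing order, so the curve runs from (0, 0) to (1, 1).
for th, f, t in zip(thresholds, fpr, tpr):
    print(f"threshold={th:.2f}  FPR={f:.2f}  TPR={t:.2f}")
```

For this input, the curve passes through the points (0, 0), (0, 0.5), (0.5, 0.5), (0.5, 1), and (1, 1).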
Code example:
```python
from sklearn.metrics import roc_curve, auc
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
import matplotlib.pyplot as plt

# Create a synthetic dataset for a binary classification problem
X, y = make_classification(n_samples=1000, n_features=20, n_informative=2,
                           n_redundant=10, random_state=42)

# Split the dataset into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2,
                                                    random_state=42)

# Use logistic regression as the classifier
classifier = LogisticRegression()
classifier.fit(X_train, y_train)
y_scores = classifier.predict_proba(X_test)[:, 1]  # predicted probability of the positive class

# Compute the ROC curve
fpr, tpr, thresholds = roc_curve(y_test, y_scores)
roc_auc = auc(fpr, tpr)  # compute the AUC (Area Under the Curve)

# Plot the ROC curve
plt.figure()
lw = 2
plt.plot(fpr, tpr, color='darkorange', lw=lw, label='ROC curve (area = %0.2f)' % roc_auc)
plt.plot([0, 1], [0, 1], color='navy', lw=lw, linestyle='--')  # chance-level diagonal
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver Operating Characteristic Example')
plt.legend(loc="lower right")
plt.show()
```
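In this example, we first create a synthetic binary classification dataset, then train a logistic regression classifier on the training set and predict positive-class probabilities for the test set.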

Next, we use the roc_curve function to compute the ROC curve and the auc function to compute the AUC value.

Finally, we use the matplotlib library to plot the ROC curve.
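As a cross-check on the AUC step described above, sklearn.metrics also offers roc_auc_score, which computes the area directly from labels and scores without an explicit call to roc_curve. The sketch below compares the two on a small made-up input; the variable names `area_from_curve` and `area_direct` are illustrative.

```python
from sklearn.metrics import roc_curve, auc, roc_auc_score

# Made-up labels and scores, for illustration only
y_true = [0, 0, 1, 1]
y_scores = [0.1, 0.4, 0.35, 0.8]

# Route 1: compute the curve, then integrate it with the trapezoidal rule
fpr, tpr, _ = roc_curve(y_true, y_scores)
area_from_curve = auc(fpr, tpr)

# Route 2: compute the area directly from labels and scores
area_direct = roc_auc_score(y_true, y_scores)

print(area_from_curve, area_direct)
```

Both routes should agree for binary labels. The two-step route is useful when the curve itself is needed for plotting, while roc_auc_score is shorter when only the number is required.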