1. Two-fold cross-validation
import numpy as np
from sklearn.model_selection import KFold

x = np.array([[1, 2], [3, 4], [1, 2], [3, 4]])
#y = np.array([1, 2, 3, 4])
kf = KFold(n_splits=2)
# 2-fold cross-validation: without shuffling, the data is split into a front half
# and a back half, and each half serves as the test set once
for train_index, test_index in kf.split(x):
    print('train_index', train_index, 'test_index', test_index)
    # train_index and test_index are arrays of row indices, not the rows themselves
    train_x = x[train_index]
    test_x = x[test_index]
    print("train_x", train_x)
    print("test_x", test_x)
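Explanation: because this is two-fold cross-validation, the dataset is split into two blocks, d1 (rows 0 and 1) and d2 (rows 2 and 3). Each block serves as the test set once while the other serves as the training set. Rows 0-1 and rows 2-3 of x hold the same values, so train_x and test_x print identically in both splits.
Results: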
train_index [2 3] test_index [0 1]
train_index [0 1] test_index [2 3]
train_x [[1 2]
 [3 4]]
test_x [[1 2]
 [3 4]]
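The claim that each block takes one turn as the test set can be checked directly. The following is a minimal sketch, assuming the same x array as above; the counter name test_counts is introduced here for illustration only and is not part of the original example.

```python
import numpy as np
from sklearn.model_selection import KFold

x = np.array([[1, 2], [3, 4], [1, 2], [3, 4]])
kf = KFold(n_splits=2)  # shuffle defaults to False, so the folds are contiguous

# test_counts (illustrative name): how often each row lands in a test fold
test_counts = np.zeros(len(x), dtype=int)
for train_index, test_index in kf.split(x):
    # Within one split, the training and test indices never overlap.
    assert set(train_index).isdisjoint(test_index)
    test_counts[test_index] += 1

# Across the two splits, every row has been test data exactly once.
print(test_counts)
```

If a count were 0 or 2, some row would have been skipped or tested twice, which is what cross-validation is meant to rule out.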
2. Three-fold cross-validation
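import numpy as np
from sklearn.model_selection import KFold

y = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10], [11, 12]])
#y = np.array([1, 2, 3, 4])
i = 0
kf = KFold(n_splits=3)
# 3-fold cross-validation: the data is split into three consecutive blocks,
# and each block serves as the test set once
for train_index, test_index in kf.split(y):
    i += 1
    print(i)  # number of the current split
    print('train_index', train_index, 'test_index', test_index)
    # train_index and test_index are arrays of row indices, not the rows themselves
    train_y = y[train_index]
    test_y = y[test_index]
    print("train_y", train_y)
    print("test_y", test_y)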
Explanation: three-fold cross-validation splits the whole dataset into three parts, d1, d2, and d3.
Results:
# First split: d2 and d3 form the training set, d1 is the test set
train_index [2 3 4 5] test_index [0 1]
train_y [[ 5  6]
 [ 7  8]
 [ 9 10]
 [11 12]]
test_y [[1 2]
 [3 4]]
# Second split: d1 and d3 form the training set, d2 is the test set
train_index [0 1 4 5] test_index [2 3]
train_y [[ 1  2]
 [ 3  4]
 [ 9 10]
 [11 12]]
test_y [[5 6]
 [7 8]]
# Third split: d1 and d2 form the training set, d3 is the test set
train_index [0 1 2 3] test_index [4 5]
train_y [[1 2]
 [3 4]
 [5 6]
 [7 8]]
test_y [[ 9 10]
 [11 12]]
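To make the d1/d2/d3 rotation concrete, the index arithmetic can be written out by hand. This is only a sketch of unshuffled k-fold splitting in plain Python, not scikit-learn's actual implementation. The function name contiguous_folds is hypothetical. It assumes that when the sample count does not divide evenly, the earliest folds each take one extra sample.

```python
def contiguous_folds(n_samples, n_splits):
    """Yield (train_index, test_index) lists for unshuffled k-fold splitting.

    Hypothetical helper for illustration; it is not part of scikit-learn.
    """
    indices = list(range(n_samples))
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        # Assumption: the first `extra` folds receive one additional sample.
        size = base + (1 if fold < extra else 0)
        stop = start + size
        test_index = indices[start:stop]               # current block is the test set
        train_index = indices[:start] + indices[stop:]  # all other blocks are training data
        yield train_index, test_index
        start = stop


for train_index, test_index in contiguous_folds(6, 3):
    print('train_index', train_index, 'test_index', test_index)
```

With 6 samples and 3 splits this yields the same index groups as the results above: [0 1], then [2 3], then [4 5] as the test block, with the remaining indices as training data.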