You don't have to use tensorflow or keras to split a dataset. If you have the sklearn package installed, you can simply use it:
from sklearn.model_selection import train_test_split
X = ...
Y = ...
x_train,x_test,y_train,y_test = train_test_split(X,Y,test_size=0.2)
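As a minimal sketch of the call above, assuming synthetic arrays in place of the elided X and Y (the array shapes, `random_state=42` and `stratify=Y` are illustrative choices, not part of the original answer):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data: 100 samples, 4 features, 2 balanced classes.
X = np.arange(400).reshape(100, 4)
Y = np.array([0, 1] * 50)

# random_state makes the shuffle reproducible;
# stratify=Y keeps the class ratio the same in both splits.
x_train, x_test, y_train, y_test = train_test_split(
    X, Y, test_size=0.2, random_state=42, stratify=Y
)

print(x_train.shape, x_test.shape)  # (80, 4) (20, 4)
```

Stratifying is optional, but it can help with small or imbalanced datasets, where a purely random 20% sample might under-represent a class.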
You can also use numpy for the same purpose:
import numpy
X = ...
Y = ...
test_size = 0.2
train_nsamples = int((1 - test_size) * len(Y))  # slice indices must be integers
x_train, x_test, y_train, y_test = X[:train_nsamples], X[train_nsamples:], Y[:train_nsamples], Y[train_nsamples:]
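Note that plain slicing keeps the original order, so if the data happens to be sorted by class, the test set may end up containing only one class. A minimal sketch of shuffling before slicing, assuming synthetic arrays and a seed that are not in the original answer:

```python
import numpy as np

# Hypothetical stand-in data, deliberately sorted by label.
X = np.arange(20).reshape(10, 2)
Y = np.array([0] * 5 + [1] * 5)

test_size = 0.2
train_nsamples = int((1 - test_size) * len(Y))  # slice bounds must be integers

# Apply the same permutation to X and Y so each row stays paired with its label.
rng = np.random.default_rng(0)  # seed chosen only for reproducibility
order = rng.permutation(len(Y))
X_shuffled, Y_shuffled = X[order], Y[order]

x_train, x_test = X_shuffled[:train_nsamples], X_shuffled[train_nsamples:]
y_train, y_test = Y_shuffled[:train_nsamples], Y_shuffled[train_nsamples:]
```

This is roughly what train_test_split does by default, since it shuffles unless told otherwise.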
In Keras:
from keras.datasets import mnist
import numpy as np
from sklearn.model_selection import train_test_split
(x_train,y_train),(x_test,y_test) = mnist.load_data()
x = np.concatenate((x_train,x_test))
y = np.concatenate((y_train,y_test))
train_size = 0.7
x_train, x_test, y_train, y_test = train_test_split(x, y, train_size=train_size)
After a day of trial and error, I found a solution.
First approach
import glob
import numpy as np
import tensorflow as tf
horse = glob.glob('full_dataset/horse/*.*')
donkey = glob.glob('full_dataset/donkey/*.*')
cow = glob.glob('full_dataset/cow/*.*')
zebra = glob.glob('full_dataset/zebra/*.*')
data = []
labels = []
for i in horse:
    # Load each image as a 280x280 RGB array; label 0 = horse
    image = tf.keras.preprocessing.image.load_img(i, color_mode='rgb', target_size=(280, 280))
    image = np.array(image)
    data.append(image)
    labels.append(0)
for i in donkey:
    # label 1 = donkey
    image = tf.keras.preprocessing.image.load_img(i, color_mode='rgb', target_size=(280, 280))
    image = np.array(image)
    data.append(image)
    labels.append(1)
for i in cow:
    # label 2 = cow
    image = tf.keras.preprocessing.image.load_img(i, color_mode='rgb', target_size=(280, 280))
    image = np.array(image)
    data.append(image)
    labels.append(2)
for i in zebra:
    # label 3 = zebra
    image = tf.keras.preprocessing.image.load_img(i, color_mode='rgb', target_size=(280, 280))
    image = np.array(image)
    data.append(image)
    labels.append(3)
data = np.array(data)
labels = np.array(labels)
from sklearn.model_selection import train_test_split
X_train,X_test,ytrain,ytest = train_test_split(data,labels,test_size=0.2,random_state=42)
Second approach
from tensorflow.keras.preprocessing.image import ImageDataGenerator
image_generator = ImageDataGenerator(rescale=1/255, validation_split=0.2)
train_dataset = image_generator.flow_from_directory(batch_size=32, directory='full_dataset', shuffle=True, target_size=(280, 280), subset="training", class_mode='categorical')
validation_dataset = image_generator.flow_from_directory(batch_size=32, directory='full_dataset', shuffle=True, target_size=(280, 280), subset="validation", class_mode='categorical')
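The main drawback of the second approach is that you can't use it to display a single image directly. The generator yields batches rather than individual images, so writing validation_dataset[1] does not give you one picture, and in my case it raised an error. With the first approach, X_test[1] works.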