Saving a Dask DataFrame with an image column to HDF5

I am trying to load images of different sizes into a Dask DataFrame column and save the dataframe in HDF5 file format.

Here is the standard approach:

import glob

import dask.dataframe as dd
import pandas as pd
import numpy as np
from skimage.io import imread


dir = '/Users/petioptrv/Downloads/mask'
filenames = glob.glob(dir + '/*.png')[:5]

df = pd.DataFrame({"paths": filenames})
ddf = dd.from_pandas(df, npartitions=2)
ddf['images'] = ddf['paths'].apply(imread, meta=('images', np.uint8))
ddf.to_hdf('test.h5', '/data')

I get the following error message:

...
File "/Users/petioptrv/miniconda3/envs/dask/lib/python3.7/site-packages/pandas/io/pytables.py", line 2214, in set_atom_string
    item=item, type=inferred_type
TypeError: Cannot serialize the column [images] because its data contents are [mixed] object dtype

Basically, PyTables detects that the column has object dtype and checks whether its inferred type is string. It is not (the error reports it as mixed), so an exception is raised.

I could probably hack around it by opening the images as byte arrays and converting them to strings, but that is far from ideal.
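
For reference, the byte-array hack mentioned above might look roughly like the sketch below. It reads each file's raw (still PNG-encoded) bytes and base64-encodes them into a plain ASCII string, so the column holds strings rather than arrays of different shapes. The helper names encode_image_file and decode_image_bytes are hypothetical, and whether the resulting strings are practical to store with to_hdf for large images has not been verified here.

```python
import base64
from pathlib import Path


def encode_image_file(path):
    """Read a file's raw bytes and return them as an ASCII base64 string.

    Hypothetical helper: the image is NOT decoded into pixels, so files of
    any size or shape all become ordinary Python strings.
    """
    return base64.b64encode(Path(path).read_bytes()).decode("ascii")


def decode_image_bytes(text):
    """Invert encode_image_file and return the original raw file bytes."""
    return base64.b64decode(text.encode("ascii"))
```

Assuming the rest of the code stays as in the question, encode_image_file could be applied to the paths column in place of imread, and the bytes returned by decode_image_bytes would be handed to an image decoder after loading. The images then have to be decoded again on every read, which is part of why this is far from ideal.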

xz7474 answered: Saving a Dask DataFrame with an image column to HDF5

Try specifying data_columns, as suggested in this question:

ddf.to_hdf('test.h5', '/data', format='table', data_columns=['images'])