Numpy: read data from CSV having numerals as strings

Original question by maximus, 2015/12/16. Tags: python, csv, numpy

I'm reading a .csv file in Python with the following command:

data = np.genfromtxt('home_data.csv', dtype=float, delimiter=',', names=True)

This CSV has a zipcode column whose values are numerals stored as quoted strings, e.g. "85281". After loading, every value in this column is nan:

data['zipcode']
Output : array([ nan,  nan,  nan, ...,  nan,  nan,  nan])

How can I convert these string values to integers, so that I get an array of actual values instead of nans?

2 solutions

#1

You have to help genfromtxt a little:

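import numpy as np
# Converters may receive bytes or str depending on the NumPy version; handle both.
unquote = lambda s: int((s.decode() if isinstance(s, bytes) else s).strip().strip('"'))
data = np.genfromtxt('home_data.csv', dtype=[int, float], delimiter=',',
                     names=True, converters={0: unquote})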

Each field is collected as bytes (in the NumPy versions current when this was written). float(b'1\n') returns 1.0, but float(b'"8210"') raises an error, which is why the column is filled with nan when dtype=float. The converters option lets you define, for each field (here field 0), a function that does the proper conversion, here decoding bytes to a string and stripping the surrounding double quotes.
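
To see why the unconverted column ends up as nan, here is a minimal sketch that reproduces the problem. It assumes a recent NumPy that accepts text streams, and it uses a hypothetical in-memory string (csv_text) in place of home_data.csv:

```python
import io
import numpy as np

# Hypothetical in-memory stand-in for home_data.csv
csv_text = 'zipcode,val\n"8210",1\n"8320",2\n"14",3\n'

# The double quotes are part of the field text, so float() rejects it.
try:
    float('"8210"')
    quoted_parses = True
except ValueError:
    quoted_parses = False

# With dtype=float and no converter, the unparsable column is filled with nan,
# while the plain numeric column is read normally.
raw = np.genfromtxt(io.StringIO(csv_text), dtype=float, delimiter=',', names=True)
all_nan = bool(np.isnan(raw['zipcode']).all())
vals = raw['val'].tolist()
```

Only the quoted column is affected; the val column parses without any help.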

If home_data.csv contains:

zipcode,val
"8210",1
"8320",2
"14",3

you will get:

data -> array([(8210, 1.0), (8320, 2.0), (14, 3.0)], dtype=[('zipcode', '<i4'), ('val', '<f8')])
data['zipcode'] -> array([8210, 8320,   14])
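
The example above can also be checked without a file on disk. This is a sketch under a few assumptions: csv_text is a hypothetical in-memory copy of the sample file, parse_quoted_int is an illustrative helper name (not part of NumPy), and the converter accepts both bytes and str because which one it receives depends on the NumPy version:

```python
import io
import numpy as np

# Hypothetical in-memory copy of the sample home_data.csv
csv_text = 'zipcode,val\n"8210",1\n"8320",2\n"14",3\n'

def parse_quoted_int(field):
    """Illustrative helper: turn '"8210"' (bytes or str) into the int 8210."""
    if isinstance(field, bytes):
        field = field.decode()
    return int(field.strip().strip('"'))

data = np.genfromtxt(
    io.StringIO(csv_text),
    dtype=[int, float],               # zipcode as int, val as float
    delimiter=',',
    names=True,                       # take field names from the header row
    converters={0: parse_quoted_int}  # apply the helper to column 0 only
)

zipcodes = data['zipcode'].tolist()
vals = data['val'].tolist()
```

Returning an int from the converter keeps the result consistent with the integer dtype requested for the first field.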

#2

Maybe not the most efficient solution, but you can read the data as strings and convert it to float afterwards:

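# dtype=None infers a type per column, so the quoted zipcodes are read as strings
data = np.genfromtxt('home_data.csv', dtype=None, delimiter=',', names=True, encoding='utf-8')
# strip the quotes, then convert (np.float was removed in NumPy 1.24; use float or int)
zipcode = np.char.strip(data['zipcode'], '"').astype(float)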

By the way, is there a reason you want to store a zipcode as a float?
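
One reason to question a numeric type at all: a zipcode is an identifier, and a numeric conversion drops leading zeros. The sketch below illustrates this with hypothetical sample data (the zipcode "02134" is not from the question) and assumes a NumPy version that supports the encoding argument of genfromtxt:

```python
import io
import numpy as np

# Hypothetical sample in which the first zipcode starts with a zero
csv_text = 'zipcode,val\n"02134",1\n"85281",2\n'

# dtype=None infers a type per column; the quoted column is read as text.
data = np.genfromtxt(io.StringIO(csv_text), dtype=None, delimiter=',',
                     names=True, encoding='utf-8')

# Keeping the values as text preserves them exactly.
as_text = np.char.strip(data['zipcode'], '"')
# Converting to int silently loses the leading zero.
as_int = as_text.astype(int)

text_list = as_text.tolist()
int_list = as_int.tolist()
```

If the zipcodes are only used as labels or lookup keys, keeping the stripped strings may be the safer choice.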
