How do I convert a DataFrame column containing strings and NaN values to floats? I also have another column whose values are a mix of strings and floats; how do I convert that entire column to floats?
53
NOTE: pd.convert_objects has now been deprecated. You should use pd.Series.astype(float) or pd.to_numeric as described in other answers.
This is available as of pandas 0.11. It forces the conversion and sets any value that cannot be converted to NaN, so it works even when astype would fail. The conversion is also applied series by series, so it won't convert, say, a column made up entirely of strings.
In [10]: df = DataFrame(dict(A = Series(['1.0','1']), B = Series(['1.0','foo'])))
In [11]: df
Out[11]:
     A    B
0  1.0  1.0
1    1  foo
In [12]: df.dtypes
Out[12]:
A object
B object
dtype: object
In [13]: df.convert_objects(convert_numeric=True)
Out[13]:
   A    B
0  1    1
1  1  NaN
In [14]: df.convert_objects(convert_numeric=True).dtypes
Out[14]:
A float64
B float64
dtype: object
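Since convert_objects is deprecated, the same coercion can be sketched with pd.to_numeric applied column by column. This is a minimal sketch rather than a drop-in replacement: it reuses the frame from the session above, and the variable name converted is only illustrative.

```python
import pandas as pd

# Same data as the session above: every value starts out as a string
df = pd.DataFrame({"A": ["1.0", "1"], "B": ["1.0", "foo"]})

# Apply pd.to_numeric to each column; values that cannot be parsed
# (here 'foo') become NaN instead of raising an error
converted = df.apply(pd.to_numeric, errors="coerce")

print(converted)
print(converted.dtypes)
```

Unlike convert_objects, this coerces every column it is applied to, so a column made up entirely of non-numeric strings would come back as all NaN. Select only the columns you want converted if that matters.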
37
You can try df.column_name = df.column_name.astype(float). NaN values stay NaN after the conversion; if you want them replaced with something else, you can use the .fillna method to do it.
Example:
In [12]: df
Out[12]:
     a    b
0  0.1  0.2
1  NaN  0.3
2  0.4  0.5
In [13]: df.a.values
Out[13]: array(['0.1', nan, '0.4'], dtype=object)
In [14]: df.a = df.a.astype(float).fillna(0.0)
In [15]: df
Out[15]:
     a    b
0  0.1  0.2
1  0.0  0.3
2  0.4  0.5
In [16]: df.a.values
Out[16]: array([ 0.1, 0. , 0.4])
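It may help to see where astype(float) stops working. The sketch below rebuilds column a from the session as a standalone Series (the names a, as_float, filled and astype_failed are illustrative, not from the original answer) and then tries the same call on a column that contains a non-numeric string.

```python
import numpy as np
import pandas as pd

# Object column holding numeric strings and a missing value
a = pd.Series(["0.1", np.nan, "0.4"], dtype=object)

# NaN is already a float, so it passes through the cast untouched
as_float = a.astype(float)

# Replace NaN only if a default value makes sense for your data
filled = as_float.fillna(0.0)

# A non-numeric string makes astype(float) raise rather than produce NaN
try:
    pd.Series(["0.1", "foo"], dtype=object).astype(float)
    astype_failed = False
except ValueError:
    astype_failed = True

print(filled.tolist(), astype_failed)
```

So astype(float) is enough when every non-missing value parses as a number. When some values might not parse, a coercing conversion such as pd.to_numeric is the safer choice.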
31
In newer versions of pandas (0.17 and up), you can use the to_numeric function. It converts a single column (a Series) at a time, and you can cover a whole DataFrame by passing it to df.apply. It also lets you choose how to treat values that can't be converted to numbers:
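import pandas as pd

# Every element parses as a number, so the result is a float Series
s = pd.Series(['1.0', '2', -3])
pd.to_numeric(s)

s = pd.Series(['apple', '1.0', '2', -3])
# errors='ignore' returns the input unchanged if parsing fails (deprecated in recent pandas releases)
pd.to_numeric(s, errors='ignore')
# errors='coerce' turns unparseable values such as 'apple' into NaN
pd.to_numeric(s, errors='coerce')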
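For the second column in the question (strings mixed with floats), coercing hides which values were dropped. The following is a rough sketch, using a made-up Series, of one way to convert and still see what failed to parse; the names numeric and unparseable are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical column mixing strings, floats, and a missing value
s = pd.Series(["3.0", 4.5, "n/a", np.nan], dtype=object)

# Coerce to numbers; anything unparseable becomes NaN
numeric = pd.to_numeric(s, errors="coerce")

# Values that were present in the input but could not be parsed:
# NaN after conversion, yet not missing before it
unparseable = s[numeric.isna() & s.notna()]

print(numeric.tolist())
print(unparseable.tolist())
```

Checking unparseable before discarding the original column is a cheap way to catch stray placeholders such as 'n/a' or '-' that would otherwise silently turn into NaN.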
20
df['MyColumnName'] = df['MyColumnName'].astype('float64')