通过迭代列将pandas数据帧转换为系列

[英]turn pandas dataframe into series by iterating over columns


I have a dataframe, and I am trying to get a Series of the form:

我有一个数据框,我试图得到一个系列的形式:

      col1  col2  col3
col1   1.0  0.20  0.70
col2   0.2  1.00  0.01
col3   0.7  0.01  1.00

GOAL:

col1Xcol1 1.0
col1Xcol2 0.2
col1Xcol3 0.7
col2Xcol1 0.2
...

My code so far:

我的代码到目前为止:

pvals2=pd.DataFrame({'col1': [1, .2,.7], 
                     'col2': [.2, 1,.01],
                     'col3': [.7,.01,1]},
                    index = ['col1', 'col2', 'col3'])

print(pvals.transpose().join(pvals, how='outer',lsuffix='_left', rsuffix='_right'))

OUTPUT:

          vote_left ballot1_left ballot1_x_left vote_right ballot1_right  \
vote              0       0.0923         0.0521          0        0.0923   
ballot1      0.0923            0         0.8213     0.0923             0   
ballot1_x    0.0521       0.8213              0     0.0521        0.8213   

          ballot1_x_right  
vote               0.0521  
ballot1            0.8213  
ballot1_x               0  

4 个解决方案

#1


0  

First stack the dataframe

首先堆叠数据帧

st = pvals2.stack()

Create a new index, by adding together the multiindex

通过将多索引添加在一起来创建新索引

newdex = st.index._get_level_values(0) + 'X' + st.index._get_level_values(1)

Set newdex as the index for the series

将newdex设置为系列的索引

st.set_axis(0,newdex)

All together

st = pvals2.stack()
st.set_axis(0,st.index._get_level_values(0) + 'X' + st.index._get_level_values(1))

col1Xcol1    1.00
col1Xcol2    0.20
col1Xcol3    0.70
col2Xcol1    0.20
col2Xcol2    1.00
col2Xcol3    0.01
col3Xcol1    0.70
col3Xcol2    0.01
col3Xcol3    1.00

#2


0  

Consider melt with column assignment for new index then select the value column since a single pandas DataFrame column is a pandas Series:

考虑使用新索引的列分配进行融合,然后选择值列,因为单个pandas DataFrame列是一个pandas系列:

Data

from io import StringIO
import pandas as pd

txt = '''      col1  col2  col3
col1   1.0  0.20  0.70
col2   0.2  1.00  0.01
col3   0.7  0.01  1.00'''

df = pd.read_table(StringIO(txt), sep="\s+")

Series build

mdf = pd.melt(df.reset_index(), id_vars='index')
mdf['s'] = mdf['index'] + 'X' + mdf['variable']

new_series = mdf.set_index('s').rename_axis(None)['value']

print(new_series)
# col1Xcol1    1.00
# col2Xcol1    0.20
# col3Xcol1    0.70
# col1Xcol2    0.20
# col2Xcol2    1.00
# col3Xcol2    0.01
# col1Xcol3    0.70
# col2Xcol3    0.01
# col3Xcol3    1.00
# Name: value, dtype: float64

#3


0  

concat and setting the new index works:

concat和设置新索引的工作原理:

>>> ser = pd.concat([pvals2[col] for col in pvals2.columns])
>>> ser.index = [pvals2[col].name + 'X' + x for col in pvals2.columns 
                 for x in pvals2[col].index]
>>> ser
col1Xcol1    1.00
col1Xcol2    0.20
col1Xcol3    0.70
col2Xcol1    0.20
col2Xcol2    1.00
col2Xcol3    0.01
col3Xcol1    0.70
col3Xcol2    0.01
col3Xcol3    1.00
dtype: float64

#4


0  

The following code:

以下代码:

pvals = pd.DataFrame({'col1': [1, .2,.7], 
                      'col2': [.2, 1,.01],
                      'col3': [.7,.01,1]},
                     index = ['row1', 'row2', 'row3'])

values = []
ind = []
for i in range(len(pvals.index)):
    for col in pvals:
        row = pvals.index[i]
        values.append(pvals[col][row])
        ind.append("%sX%s" % (row, col))

newpvals = pd.Series(values, ind)

gives:

>>> newvals
row1Xcol1    1.00
row1Xcol2    0.20
row1Xcol3    0.70
row2Xcol1    0.20
row2Xcol2    1.00
row2Xcol3    0.01
row3Xcol1    0.70
row3Xcol2    0.01
row3Xcol3    1.00
dtype: float64

Edit: I misread, so changed into Series.

编辑:我误读了,所以变成了系列。

智能推荐

注意!

本站翻译的文章,版权归属于本站,未经许可禁止转摘,转摘请注明本文地址:http://www.silva-art.net/blog/2018/03/02/1f2ad2a6f671fe3d6bb66f59919c8041.html



 
© 2014-2019 ITdaan.com 粤ICP备14056181号  

赞助商广告