I have a dictionary with city names as keys, where each key maps to a list of dates. For example:
{
'A':['2017-01-02','2017-01-03'],
'B':['2017-02-02','2017-02-03','2017-02-04','2017-02-05'],
'C':['2016-02-02']
}
I want to convert this into the following DataFrame with two columns:
City_Name Date
A 2017-01-02
A 2017-01-03
B 2017-02-02
B 2017-02-03
B 2017-02-04
B 2017-02-05
C 2016-02-02
Or we can use melt:
pd.DataFrame(dict([ (k,pd.Series(v)) for k,v in d.items() ])).melt().dropna()
Out[51]:
variable value
0 A 2017-01-02
1 A 2017-01-03
4 B 2017-02-02
5 B 2017-02-03
6 B 2017-02-04
7 B 2017-02-05
8 C 2016-02-02
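The gaps in the index above (2 and 3 are missing) come from the padding that pandas adds when lists of unequal length are aligned into columns. The following is a minimal sketch of the same melt idea, split into steps so the padding is visible. The names `wide` and `long`, the `var_name`/`value_name` arguments, and the final `reset_index` are additions for illustration, not part of the original answer.

```python
import pandas as pd

d = {
    'A': ['2017-01-02', '2017-01-03'],
    'B': ['2017-02-02', '2017-02-03', '2017-02-04', '2017-02-05'],
    'C': ['2016-02-02'],
}

# Wrapping each list in a Series lets pandas align lists of unequal length;
# shorter columns are padded with NaN.
wide = pd.DataFrame({k: pd.Series(v) for k, v in d.items()})

# melt turns the column labels into a 'City_Name' column; dropna removes
# the padding rows and reset_index renumbers the rows from 0.
long = (wide.melt(var_name='City_Name', value_name='Date')
            .dropna()
            .reset_index(drop=True))
print(long)
```

Naming the columns inside melt avoids a separate rename step afterwards.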
A way inspired by piR (note that melt discards the index here, so the variable column holds list positions rather than city names):
pd.Series(d).apply(pd.Series).melt().dropna()
Out[142]:
variable value
0 0 2017-01-02
1 0 2017-02-02
2 0 2016-02-02
3 1 2017-01-03
4 1 2017-02-03
7 2 2017-02-04
10 3 2017-02-05
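The output above no longer contains the city names, because melt ignores the index. One possible way to keep them is to move the index into a regular column first and pass it as `id_vars`. This is a sketch under that assumption. The names `wide`, `long`, and `position` are illustrative, and the final sort is only there to give a stable row order.

```python
import pandas as pd

d = {
    'A': ['2017-01-02', '2017-01-03'],
    'B': ['2017-02-02', '2017-02-03', '2017-02-04', '2017-02-05'],
    'C': ['2016-02-02'],
}

# One row per city, one column per list position (NaN where a list is shorter).
wide = pd.Series(d).apply(pd.Series)

long = (wide.rename_axis('City_Name')
            .reset_index()                      # city names become a column
            .melt(id_vars='City_Name',          # keep the city on every row
                  var_name='position',
                  value_name='Date')
            .dropna(subset=['Date'])            # drop the padding
            .drop(columns='position')
            .sort_values(['City_Name', 'Date'])
            .reset_index(drop=True))
print(long)
```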
Use numpy.repeat to repeat the keys:
import numpy as np
import pandas as pd

# number of dates per city
a = [len(x) for x in d.values()]
# flatten all date lists into one list
b = [i for s in d.values() for i in s]
df = pd.DataFrame({'City_Name': np.repeat(list(d.keys()), a), 'Date': b})
print(df)
City_Name Date
0 C 2016-02-02
1 B 2017-02-02
2 B 2017-02-03
3 B 2017-02-04
4 B 2017-02-05
5 A 2017-01-02
6 A 2017-01-03
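To make the role of numpy.repeat clearer, here is a small standalone sketch of what it does when given a list of per-element counts. The variable names `keys`, `counts`, and `repeated` are illustrative only.

```python
import numpy as np

keys = ['A', 'B', 'C']
counts = [2, 4, 1]  # how many dates each city has

# Each key is repeated as many times as its matching count.
repeated = np.repeat(keys, counts)
print(repeated.tolist())  # ['A', 'A', 'B', 'B', 'B', 'B', 'C']
```

Because the repeated keys and the flattened dates are built by walking the dictionary in the same order, they line up row by row.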
Another approach, similar to Danh Pham's solution (credit to him):
df = pd.DataFrame([(i, day) for i, j in d.items() for day in j],
                  columns=['City_Name', 'Date'])
print(df)
City_Name Date
0 C 2016-02-02
1 B 2017-02-02
2 B 2017-02-03
3 B 2017-02-04
4 B 2017-02-05
5 A 2017-01-02
6 A 2017-01-03
You can first reshape your data into a list of (name, date) tuples, e.g. ('A', '2017-01-01'), before creating the DataFrame.
Try this:
import pandas as pd
data = {
'A':['2017-01-02','2017-01-03'],
'B':['2017-02-02','2017-02-03','2017-02-04','2017-02-05'],
'C':['2016-02-02']
}
pd.DataFrame([(i[0], day) for i in data.items() for day in i[1]])
Output:
0 1
0 A 2017-01-02
1 A 2017-01-03
2 C 2016-02-02
3 B 2017-02-02
4 B 2017-02-03
5 B 2017-02-04
6 B 2017-02-05
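The frame above has the default column labels 0 and 1, and the dates are still plain strings. The following is a minimal sketch, assuming you want the column names from the question and real datetime values. The `pd.to_datetime` step is an addition for illustration and not part of the original answer.

```python
import pandas as pd

data = {
    'A': ['2017-01-02', '2017-01-03'],
    'B': ['2017-02-02', '2017-02-03', '2017-02-04', '2017-02-05'],
    'C': ['2016-02-02'],
}

# Build (city, date) tuples, one per date, then name the columns.
rows = [(city, day) for city, days in data.items() for day in days]
df = pd.DataFrame(rows, columns=['City_Name', 'Date'])

# Optional: parse the ISO-formatted strings into datetime values.
df['Date'] = pd.to_datetime(df['Date'])
print(df.dtypes)
```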
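You can use DataFrame.from_dict with orient='index'. The default orient='columns' requires all lists to have the same length, but with orient='index' shorter lists are padded with missing values, which stack() dropped in the run below: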
import pandas as pd
d = {
'A':['2017-01-02','2017-01-03'],
'B':['2017-02-02','2017-02-03','2017-02-04','2017-02-05'],
'C':['2016-02-02']
}
df = pd.DataFrame.from_dict(d, orient='index').stack().reset_index()
df.columns = ["City_Name", "A", "Date"]
del df["A"]
print(df)
Result:
City_Name Date
0 B 2017-02-02
1 B 2017-02-03
2 B 2017-02-04
3 B 2017-02-05
4 A 2017-01-02
5 A 2017-01-03
6 C 2016-02-02
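As a hedged illustration of how the orient argument matters here, the sketch below first shows that the default orientation rejects lists of unequal length, then repeats the stack approach without the temporary "A" column. The explicit `dropna()` is an assumption added so the result does not depend on whether a given pandas version drops missing values inside `stack()`. The names `ragged_error` and `wide` are illustrative.

```python
import pandas as pd

d = {
    'A': ['2017-01-02', '2017-01-03'],
    'B': ['2017-02-02', '2017-02-03', '2017-02-04', '2017-02-05'],
    'C': ['2016-02-02'],
}

# With the default orient='columns', unequal list lengths raise ValueError.
ragged_error = None
try:
    pd.DataFrame.from_dict(d)
except ValueError as exc:
    ragged_error = exc

# With orient='index', each key becomes a row and short rows are padded.
wide = pd.DataFrame.from_dict(d, orient='index')

df = (wide.stack()
          .dropna()                          # remove padding if still present
          .reset_index(level=1, drop=True)   # discard the list-position level
          .rename_axis('City_Name')
          .reset_index(name='Date'))
print(df)
```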