pandas DataFrame: get the first row of each group

I have a pandas DataFrame like the one below.

    import pandas as pd

    df = pd.DataFrame({
        'id': [1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 4, 5, 6, 6, 6, 7, 7],
        'value': ["first", "second", "second", "first", "second",
                  "first", "third", "fourth", "fifth", "second",
                  "fifth", "first", "first", "second", "third",
                  "fourth", "fifth"],
    })

I want to group this by ["id", "value"] and get the first row of each group.

        id   value
    0    1   first
    1    1  second
    2    1  second
    3    2   first
    4    2  second
    5    3   first
    6    3   third
    7    3  fourth
    8    3   fifth
    9    4  second
    10   4   fifth
    11   5   first
    12   6   first
    13   6  second
    14   6   third
    15   7  fourth
    16   7   fifth

Expected result:

    id   value
     1   first
     2   first
     3   first
     4  second
     5   first
     6   first
     7  fourth

I tried the following, which only gives the first row of the whole DataFrame. Any help with this is appreciated.

    In [25]: for index, row in df.iterrows():
       ....:     df2 = pd.DataFrame(df.groupby(['id','value']).reset_index().ix[0])

    >>> df.groupby('id').first()
         value
    id
    1    first
    2    first
    3    first
    4   second
    5    first
    6    first
    7   fourth

If you need id as a column:

    >>> df.groupby('id').first().reset_index()
       id   value
    0   1   first
    1   2   first
    2   3   first
    3   4  second
    4   5   first
    5   6   first
    6   7  fourth
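As an aside, when the frame is already in the order you care about, the same rows can probably be picked without a groupby at all. The sketch below swaps in `drop_duplicates` (a different technique from the `first()` call above) on the question's data; the name `first_rows` is just an illustrative choice.

```python
import pandas as pd

df = pd.DataFrame({
    'id': [1, 1, 1, 2, 2, 3, 3, 3, 3, 4, 4, 5, 6, 6, 6, 7, 7],
    'value': ["first", "second", "second", "first", "second",
              "first", "third", "fourth", "fifth", "second",
              "fifth", "first", "first", "second", "third",
              "fourth", "fifth"],
})

# Keep only the first occurrence of each id, in original row order,
# then renumber the rows from 0.
first_rows = df.drop_duplicates(subset='id', keep='first').reset_index(drop=True)
print(first_rows)
```

Unlike `first()`, this keeps each row exactly as it appears, including any missing values in it.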

To get the first n records of each group, you can use head():

    >>> df.groupby('id').head(2).reset_index(drop=True)
        id   value
    0    1   first
    1    1  second
    2    2   first
    3    2  second
    4    3   first
    5    3   third
    6    4  second
    7    4   fifth
    8    5   first
    9    6   first
    10   6  second
    11   7  fourth
    12   7   fifth

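This will give you the second row of each group (zero-indexed; nth(0) selects the first row, much like first(), except that first() skips null values):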

 df.groupby('id').nth(1)

Docs: http://pandas.pydata.org/pandas-docs/stable/groupby.html#taking-the-nth-row-of-each-group
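One caveat with the calls above: `first()` returns the first non-null entry per column, while `nth(0)` and `head(1)` take the literal first row. On the question's data there are no missing values, so the difference does not show. The small frame below is made up purely to illustrate it, and the exact shape of the `nth` result may vary between pandas versions, so only the values are compared here.

```python
import numpy as np
import pandas as pd

# Hypothetical data: the first row of group 1 has a missing value.
df = pd.DataFrame({
    "id": [1, 1, 2, 2],
    "value": [np.nan, "second", "first", "second"],
})

# first() skips the NaN and picks the first non-null value per group.
by_first = df.groupby("id")["value"].first()

# nth(0) and head(1) keep the literal first row, NaN included.
by_nth = df.groupby("id")["value"].nth(0)
by_head = df.groupby("id").head(1)

print(by_first.tolist())          # ['second', 'first']
print(by_nth.tolist())            # first entry is NaN
print(by_head["value"].tolist())  # first entry is NaN
```

So if "first row" means the row as it physically appears, `head(1)` or `nth(0)` is likely the safer choice.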

Maybe this is what you want:

    import pandas as pd

    idx = pd.MultiIndex.from_product([['state1','state2'], ['county1','county2','county3','county4']])
    df = pd.DataFrame({'pop': [12,15,65,42,78,67,55,31]}, index=idx)

                    pop
    state1 county1   12
           county2   15
           county3   65
           county4   42
    state2 county1   78
           county2   67
           county3   55
           county4   31

    df.groupby(level=0, group_keys=False).apply(lambda x: x.sort_values('pop', ascending=False)).groupby(level=0).head(3)

    Out[29]:
                    pop
    state1 county3   65
           county4   42
           county2   15
    state2 county1   78
           county2   67
           county3   55
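If only the top values per group are needed, a shorter route might be `nlargest` on the grouped column. This is a sketch of an alternative, not the approach used above; it returns a Series rather than a DataFrame, and the name `top3` is illustrative.

```python
import pandas as pd

idx = pd.MultiIndex.from_product(
    [['state1', 'state2'], ['county1', 'county2', 'county3', 'county4']]
)
df = pd.DataFrame({'pop': [12, 15, 65, 42, 78, 67, 55, 31]}, index=idx)

# Take the 3 largest 'pop' values within each state (index level 0).
# Bracket access is used because DataFrame.pop is also a method name.
top3 = df.groupby(level=0, group_keys=False)['pop'].nlargest(3)
print(top3)
```

The values come out per state in descending order, matching the `pop` column of the output above.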
