Get statistics for each group (such as count, mean, etc.) using pandas GroupBy?
In pandas, you can use the groupby() method to group data by one or more columns, and then use the agg() method to compute one or more statistics for each group.
For example, suppose you have a DataFrame called df with columns 'A' and 'B', and you want to group the data by column 'A' and calculate the count, mean, and sum of column 'B' for each group:
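import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar', 'baz', 'foo', 'bar', 'baz'],
                   'B': [1, 2, 3, 4, 5, 6]})

# Group the rows by the values in column 'A'
grouped = df.groupby('A')
# Compute the count, mean, and sum of column 'B' within each group
result = grouped['B'].agg(['count', 'mean', 'sum'])
print(result)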
This will output a DataFrame that shows the count, mean, and sum of column 'B' for each unique value of column 'A':
     count  mean  sum
A
bar      2   3.5    7
baz      2   4.5    9
foo      2   2.5    5
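One detail about the 'count' statistic is worth keeping in mind: it counts only non-missing values, while size() counts every row in a group. The following is a minimal sketch of the difference, assuming a hypothetical DataFrame df_nan with one missing value in 'B' (this data is not part of the example above):

```python
import numpy as np
import pandas as pd

# Hypothetical data: the second 'foo' row has a missing value in 'B'.
df_nan = pd.DataFrame({'A': ['foo', 'bar', 'foo', 'bar'],
                       'B': [1.0, 2.0, np.nan, 5.0]})

grouped_nan = df_nan.groupby('A')

# count() ignores NaN, so 'foo' has a count of 1.
counts = grouped_nan['B'].count()

# size() counts every row in the group, so 'foo' has a size of 2.
sizes = grouped_nan.size()

print(counts)
print(sizes)
```

If the column has no missing values, as in the example above, the two give the same numbers.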
You can also use a dictionary to specify the statistics for each column:
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar', 'baz', 'foo', 'bar', 'baz'],
                   'B': [1, 2, 3, 4, 5, 6]})

grouped = df.groupby('A')
# Map each column name to the list of statistics to compute for it
result = grouped.agg({'B': ['count', 'mean', 'sum']})
print(result)
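The dictionary form returns a result whose columns form a two-level MultiIndex: ('B', 'count'), ('B', 'mean'), and ('B', 'sum'). If flat column names are preferred, named aggregation is one option. The sketch below assumes pandas 0.25 or later, and the output column names (b_count, b_mean, b_sum) are illustrative choices rather than anything required by pandas:

```python
import pandas as pd

df = pd.DataFrame({'A': ['foo', 'bar', 'baz', 'foo', 'bar', 'baz'],
                   'B': [1, 2, 3, 4, 5, 6]})

# Named aggregation: each keyword is an output column name, and each value
# is a (source column, aggregation function) pair.
flat = df.groupby('A').agg(
    b_count=('B', 'count'),
    b_mean=('B', 'mean'),
    b_sum=('B', 'sum'),
)
print(flat)
```

The values are the same as in the earlier results; only the column labels differ, which can make the result easier to merge or export.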