Pandas agg

Pandas is a Python library for data manipulation and analysis, and aggregation is one of its core operations. The agg method (short for aggregate) applies one or more operations along a specified axis of a DataFrame or Series. This article walks through agg in depth, covering its syntax and its use on plain DataFrames and on GroupBy objects, with examples throughout.

Introduction to the pandas agg Function

The agg function lets you apply a function, a list of functions, or a dict mapping column labels to functions to a DataFrame or Series. For a DataFrame, the axis argument controls whether each function is applied to every column (the default) or to every row. Built-in operations such as sum, mean, max, and min can be passed by name as strings, and custom functions can be passed as callables.
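
The examples later in this article all use DataFrames, so here is a minimal sketch of the same idea on a Series. The Series contents and the variable names (s, total, summary) are illustrative assumptions, not taken from the examples below.

```python
import pandas as pd

# A small Series; the values are illustrative only
s = pd.Series([1, 2, 3, 4], name='values')

# A single function name reduces the Series to one scalar
total = s.agg('sum')

# A list of function names returns a Series indexed by those names
summary = s.agg(['min', 'max', 'mean'])

print(total)
print(summary)
```

With a single function the result is a scalar; with a list, each function name becomes a label in the result's index.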

Basic Syntax of pandas agg

The basic syntax of the agg function is as follows:

DataFrame.agg(func=None, axis=0, *args, **kwargs)
  • func: A function, a function name given as a string, a list of functions or names, or a dict mapping column names to functions (or to lists of functions).
  • axis: {0 or ‘index’, 1 or ‘columns’}, default 0. If 0 or ‘index’: apply the function to each column. If 1 or ‘columns’: apply the function to each row.
  • args, kwargs: Positional and keyword arguments to pass through to func.
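
To make the axis and kwargs parameters concrete, here is a short sketch. The helper sum_above and its floor parameter are hypothetical names introduced only for illustration; they are not part of pandas.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# axis=0 (the default): one result per column
col_sums = df.agg('sum')

# axis=1: one result per row
row_sums = df.agg('sum', axis=1)

# Hypothetical helper: sums only the values greater than `floor`
def sum_above(series, floor=0):
    return series[series > floor].sum()

# Extra keyword arguments are forwarded to the function
filtered = df.agg(sum_above, floor=2)

print(col_sums)
print(row_sums)
print(filtered)
```

Here col_sums has one entry per column, row_sums has one entry per row, and filtered sums only the values above 2 in each column.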

Examples of Using pandas agg Function

Example 1: Applying a Single Function

import pandas as pd

# Create a DataFrame
data = {'A': [1, 2, 3], 'B': [4, 5, 6]}
df = pd.DataFrame(data)

# Apply sum function using agg
result = df.agg('sum')
print(result)

Output:

A     6
B    15
dtype: int64

Example 2: Applying Multiple Functions

import pandas as pd

# Create a DataFrame
data = {'A': [1, 2, 3], 'B': [4, 5, 6]}
df = pd.DataFrame(data)

# Apply multiple functions using agg
result = df.agg(['sum', 'mean'])
print(result)

Output:

        A     B
sum   6.0  15.0
mean  2.0   5.0

Example 3: Applying Different Functions to Different Columns

import pandas as pd

# Create a DataFrame
data = {'A': [1, 2, 3], 'B': [4, 5, 6]}
df = pd.DataFrame(data)

# Apply different functions to different columns using agg
result = df.agg({'A': 'sum', 'B': 'mean'})
print(result)

Output:

A    6.0
B    5.0
dtype: float64
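
The dict form can also map a column to a list of functions. The following is a minimal sketch of that variation, reusing the same sample data; the particular choice of functions per column is an assumption made for illustration.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# Each column gets its own list of functions
result = df.agg({'A': ['sum', 'min'], 'B': ['mean']})
print(result)

# Combinations that were not requested (for example the mean of A) are NaN
print(result.isna())
```

The result is a DataFrame with one row per function name and one column per source column, so any function that was not requested for a column shows up as NaN.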

Example 4: Using Custom Functions

import pandas as pd

# Create a DataFrame
data = {'A': [1, 2, 3], 'B': [4, 5, 6]}
df = pd.DataFrame(data)

# Define a custom function
def increment(x):
    return x + 1

# Apply custom function using agg
# (increment does not reduce its input, so the result has the same shape as df)
result = df.agg(increment)
print(result)

Example 5: Aggregation with Conditions

import pandas as pd

# Create a DataFrame
data = {'A': [1, 2, 3], 'B': [4, 5, 6]}
df = pd.DataFrame(data)

# Define a custom function with condition
def custom_sum(series):
    return series[series > 1].sum()

# Apply custom function using agg
result = df.agg(custom_sum)
print(result)

Output:

A     5
B    15
dtype: int64

Example 6: Using agg with GroupBy

import pandas as pd

# Create a DataFrame
data = {'Group': ['A', 'A', 'B', 'B'], 'Value': [10, 15, 10, 20]}
df = pd.DataFrame(data)

# Group by 'Group' column and apply sum
grouped = df.groupby('Group')
result = grouped.agg('sum')
print(result)

Output:

       Value
Group
A         25
B         30

Example 7: Multiple Aggregations on GroupBy Object

import pandas as pd

# Create a DataFrame
data = {'Group': ['A', 'A', 'B', 'B'], 'Value': [10, 15, 10, 20]}
df = pd.DataFrame(data)

# Group by 'Group' column and apply multiple aggregations
grouped = df.groupby('Group')
result = grouped.agg(['sum', 'mean', 'max'])
print(result)

Output:

      Value
        sum  mean max
Group
A        25  12.5  15
B        30  15.0  20
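
Passing a list of functions to a GroupBy object produces two-level column labels. If flat column names are preferred, pandas also supports named aggregation, sketched below on the same sample data. The output column names (total, average, largest) are arbitrary choices made for this illustration.

```python
import pandas as pd

data = {'Group': ['A', 'A', 'B', 'B'], 'Value': [10, 15, 10, 20]}
df = pd.DataFrame(data)

# Named aggregation: output_name=(source_column, function)
result = df.groupby('Group').agg(
    total=('Value', 'sum'),
    average=('Value', 'mean'),
    largest=('Value', 'max'),
)
print(result)
```

Each keyword becomes a column in the result, which avoids having to flatten a MultiIndex afterwards.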

Example 8: Aggregating with Custom Functions on GroupBy Object

import pandas as pd

# Create a DataFrame
data = {'Group': ['A', 'A', 'B', 'B'], 'Value': [10, 15, 10, 20]}
df = pd.DataFrame(data)

# Define a custom function
def range_func(x):
    return x.max() - x.min()

# Group by 'Group' column and apply custom function
grouped = df.groupby('Group')
result = grouped.agg(range_func)
print(result)

Output:

       Value
Group
A          5
B         10

Example 9: Using Lambda Functions in agg

import pandas as pd

# Create a DataFrame
data = {'A': [1, 2, 3], 'B': [4, 5, 6]}
df = pd.DataFrame(data)

# Apply lambda function using agg
# (the lambda doubles each value rather than reducing, so the shape is unchanged)
result = df.agg(lambda x: x * 2)
print(result)

Output:

   A   B
0  2   8
1  4  10
2  6  12

Example 10: Aggregating with Multiple Lambda Functions

import pandas as pd

# Create a DataFrame
data = {'A': [1, 2, 3], 'B': [4, 5, 6]}
df = pd.DataFrame(data)

# Apply multiple lambda functions using agg
result = df.agg({'A': lambda x: x.max() - x.min(), 'B': lambda x: x.sum()})
print(result)

Output:

A     2
B    15
dtype: int64

Conclusion

The agg function in Pandas is a versatile tool for data aggregation. It accepts a single function, a list of functions, a dict that assigns different functions to different columns, or custom and lambda functions, and it works on both DataFrames and GroupBy objects. Knowing these forms lets you express a wide range of summary computations in a single call.