Pandas Apply Example

Pandas is a Python data manipulation library that offers many functions for transforming data. One of the most versatile is apply(), which applies a function along an axis of a DataFrame or to each value of a Series. This article walks through several examples of apply() to show its flexibility in data analysis.
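Before turning to DataFrames, here is a minimal sketch of the Series case mentioned above. The Series `words` and its values are illustrative assumptions, not data from this article; the built-in len is simply a convenient function to call once per value.

```python
import pandas as pd

# Hypothetical data for illustration only
words = pd.Series(['apple', 'kiwi', 'banana'])

# Series.apply calls the function once for each value in the Series
lengths = words.apply(len)
print(lengths)
```

The result is a new Series with the same index as `words`, holding the length of each string.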

Introduction to apply()

The apply() function in Pandas can be used on both Series and DataFrame objects. On a DataFrame, apply() calls the function once per column or once per row, depending on the axis. The basic syntax of apply() is:

DataFrame.apply(func, axis=0, raw=False, result_type=None, args=(), **kwds)
  • func: The function to apply to each column or row.
  • axis: {0 or ‘index’, 1 or ‘columns’}, default 0. Axis along which the function is applied:
    • 0 or ‘index’: apply function to each column.
    • 1 or ‘columns’: apply function to each row.
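  • raw: bool, default False. If False, each row or column is passed to the function as a Series; if True, it is passed as a NumPy ndarray.
  • result_type: {‘expand’, ‘reduce’, ‘broadcast’, None}, default None. Controls the shape of the result; it only takes effect when axis=1.
  • args: Tuple of positional arguments passed to func after the row or column.
  • **kwds: Additional keyword arguments passed to func.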
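The effect of axis and raw is easiest to see side by side. The following is a small sketch, assuming a two-column DataFrame made up for this illustration; the helper `record_type` and the list `received_ndarray` are hypothetical names used only to record what apply() passes to the function.

```python
import numpy as np
import pandas as pd

# Illustrative DataFrame (not from the examples below)
df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# axis=0 (default): the function receives each column as a Series
col_max = df.apply(lambda col: col.max())

# axis=1: the function receives each row as a Series
row_max = df.apply(lambda row: row.max(), axis=1)

# raw=True: the function receives a NumPy ndarray instead of a Series
received_ndarray = []

def record_type(values):
    # Remember whether the argument was an ndarray, then reduce it
    received_ndarray.append(isinstance(values, np.ndarray))
    return values.sum()

raw_sum = df.apply(record_type, raw=True)

print(col_max)
print(row_max)
print(raw_sum)
```

With axis=0 the result is indexed by column name, and with axis=1 it is indexed by the row labels. raw=True can be useful when the function only needs NumPy operations and does not rely on index labels.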

Example 1: Basic Usage on a DataFrame

Let’s start with a simple example that applies a function to each column of a DataFrame to calculate the range (max - min) of its values.

import pandas as pd

# Create a sample DataFrame
data = {'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}
df = pd.DataFrame(data)

# Define a simple range function
def range_func(x):
    return x.max() - x.min()

# Apply function
result = df.apply(range_func)
print(result)

Output:

A    2
B    2
C    2
dtype: int64

Example 2: Applying a Function to Rows

You can apply a function to each row of a DataFrame by setting axis=1.

import pandas as pd

# Create a sample DataFrame
data = {'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}
df = pd.DataFrame(data)

# Define a function that sums the values of each row
def sum_row(row):
    # row is a Series, so use its own sum() method
    return row.sum()

# Apply function to each row
result = df.apply(sum_row, axis=1)
print(result)

Output:

0    12
1    15
2    18
dtype: int64
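For simple reductions like the two above, pandas already provides built-in methods that give the same result and are typically faster than calling a Python function per column or row. The comparison below is a sketch that reuses the sample data from Examples 1 and 2; the variable names are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]})

# Column range: apply() versus built-in reductions
col_range_apply = df.apply(lambda col: col.max() - col.min())
col_range_builtin = df.max() - df.min()

# Row sum: apply() versus the built-in sum along axis=1
row_sum_apply = df.apply(lambda row: row.sum(), axis=1)
row_sum_builtin = df.sum(axis=1)

# Both pairs should hold identical values
print(col_range_apply.equals(col_range_builtin))
print(row_sum_apply.equals(row_sum_builtin))
```

apply() remains the more general tool when no built-in method expresses the logic you need.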

Example 3: Using Lambda Functions

Lambda functions are anonymous functions defined with the lambda keyword. They are handy for short, one-off functions passed to apply().

import pandas as pd

# Create a DataFrame
data = {'A': [10, 20, 30], 'B': [40, 50, 60], 'C': [70, 80, 90]}
df = pd.DataFrame(data)

# Apply a lambda function to each column
result = df.apply(lambda x: x ** 2)
print(result)

Output:

     A     B     C
0  100  1600  4900
1  400  2500  6400
2  900  3600  8100

Example 4: Applying Functions that Return Multiple Values

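When a function applied with axis=1 returns several values, apply() can spread them across new columns. If the function returns a pd.Series, as in the code below, this expansion happens by default; result_type='expand' is only required when the function returns a plain list or tuple.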

import pandas as pd

# Create a DataFrame
data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]}
df = pd.DataFrame(data)

# Define a function that returns multiple values
def multiple_values(row):
    return pd.Series([row['Age'] * 2, row['Age'] + 10])

# Apply function
result = df.apply(multiple_values, axis=1, result_type='expand')
result.columns = ['AgeDoubled', 'AgePlusTen']
print(result)

Output:

   AgeDoubled  AgePlusTen
0          50          35
1          60          40
2          70          45
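Example 4 returns a pd.Series from the applied function. As a rough sketch of the case where result_type='expand' actually changes the outcome, the snippet below returns a plain tuple instead. The function name `age_pair` is a hypothetical variant introduced for this illustration.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]})

# Hypothetical variant of multiple_values that returns a tuple
def age_pair(row):
    return (row['Age'] * 2, row['Age'] + 10)

# Without result_type, each tuple is stored as a single value in a Series
as_series = df.apply(age_pair, axis=1)

# With result_type='expand', each tuple element becomes its own column
as_frame = df.apply(age_pair, axis=1, result_type='expand')
as_frame.columns = ['AgeDoubled', 'AgePlusTen']

print(type(as_series).__name__)
print(as_frame)
```

The first call yields a Series of tuples, while the second yields a DataFrame with one column per tuple element.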

Example 5: Applying a Function with Additional Arguments

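You can pass additional positional arguments to the function being applied using the args parameter. args must be a tuple, which is why a single argument is written with a trailing comma, as in (10,).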

import pandas as pd

# Create a DataFrame
data = {'A': [1, 2, 3], 'B': [4, 5, 6]}
df = pd.DataFrame(data)

# Define a function that uses additional arguments
def multiply_by_factor(x, factor):
    return x * factor

# Apply function with additional arguments
result = df.apply(multiply_by_factor, args=(10,))
print(result)

Output:

    A   B
0  10  40
1  20  50
2  30  60
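Besides args, apply() forwards extra keyword arguments (the **kwds in the signature) to the function. The sketch below assumes a hypothetical function `scale_and_shift` with a `factor` and an `offset` parameter; neither appears in the original example.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# Hypothetical function with one required and one optional extra parameter
def scale_and_shift(x, factor, offset=0):
    return x * factor + offset

# Pass both extra arguments by keyword
by_keyword = df.apply(scale_and_shift, factor=10, offset=1)

# Mix a positional argument (via args) with a keyword argument
mixed = df.apply(scale_and_shift, args=(10,), offset=1)

print(by_keyword)
print(mixed.equals(by_keyword))
```

Keyword arguments tend to read more clearly than a positional tuple when the function takes more than one extra parameter.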

Example 6: Using apply() with a Complex Function

apply() also works with functions that contain more involved logic, such as branching on one column of a row to decide which other column to use.

import pandas as pd

# Create a DataFrame
data = {'A': [1, 2, 3], 'B': [4, 5, 6], 'C': [7, 8, 9]}
df = pd.DataFrame(data)

# Define a complex function
def complex_function(row):
    if row['A'] > 1:
        return row['B'] * 2
    else:
        return row['C'] * 3

# Apply complex function to each row
result = df.apply(complex_function, axis=1)
print(result)

Output:

0    21
1    10
2    12
dtype: int64

Example 7: Applying Functions Conditionally

The applied function can also branch on the data itself. In this example, the result for each row depends on whether the value in column A exceeds 15.

import pandas as pd

# Create a DataFrame
data = {'A': [10, 20, 30], 'B': [20, 30, 40]}
df = pd.DataFrame(data)

# Define a conditional function
def conditional_apply(x):
    if x['A'] > 15:
        return x['A'] + x['B']
    else:
        return x['A'] - x['B']

# Apply conditional function
result = df.apply(conditional_apply, axis=1)
print(result)

Output:

0   -10
1    50
2    70
dtype: int64
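Row-wise conditionals like the ones in Examples 6 and 7 can often be expressed without apply(). The following sketch, which assumes NumPy is available, reproduces Example 7 with numpy.where; the variable names `via_apply` and `via_where` are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [10, 20, 30], 'B': [20, 30, 40]})

# Row-wise version, equivalent to conditional_apply in Example 7
via_apply = df.apply(
    lambda row: row['A'] + row['B'] if row['A'] > 15 else row['A'] - row['B'],
    axis=1,
)

# Vectorized version: the condition is evaluated on whole columns at once
via_where = pd.Series(
    np.where(df['A'] > 15, df['A'] + df['B'], df['A'] - df['B']),
    index=df.index,
)

print(via_where)
```

On large DataFrames the vectorized form usually runs much faster, because it avoids calling a Python function once per row.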

Example 8: Using apply() with Global Variables

Sometimes, you might need to use global variables within the function applied. This should be done with caution to avoid side effects.

import pandas as pd

# Global variable
factor = 3

# Create a DataFrame
data = {'A': [1, 2, 3], 'B': [4, 5, 6]}
df = pd.DataFrame(data)

# Define a function that uses a global variable
def use_global(x):
    return x * factor

# Apply function using a global variable
result = df.apply(use_global)
print(result)

Output:

   A   B
0  3  12
1  6  15
2  9  18
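One way to avoid the global variable is to bind the value to the function before passing it to apply(). The sketch below uses functools.partial from the standard library; the names `multiply` and `triple` are hypothetical and chosen for this illustration.

```python
from functools import partial

import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# The factor is an explicit parameter rather than a global
def multiply(x, factor):
    return x * factor

# Bind factor=3 to create a one-argument function
triple = partial(multiply, factor=3)

result = df.apply(triple)
print(result)
```

Because the factor is fixed when `triple` is created, later changes to other variables in the module cannot silently alter the result.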

Example 9: Modifying Data Within apply()

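The pandas documentation warns that functions which mutate the object passed to them can produce unexpected behavior or errors and are not supported. If a row-wise function needs to change values, a safer pattern is to work on a copy of the row and return that copy.

import pandas as pd

# Create a DataFrame
data = {'A': [1, 2, 3], 'B': [4, 5, 6]}
df = pd.DataFrame(data)

# Define a function that returns a modified copy of each row
def modify_data(x):
    x = x.copy()  # avoid mutating the object that apply() passes in
    x['A'] = x['A'] ** 2
    return x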

# Apply function that modifies data
result = df.apply(modify_data, axis=1)
print(result)

Output:

   A  B
0  1  4
1  4  5
2  9  6
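When the goal is simply to derive a changed column, the same result can be obtained without touching rows inside apply() at all. This is a minimal sketch using DataFrame.assign, which returns a new DataFrame and leaves the original unchanged.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [4, 5, 6]})

# assign() returns a new DataFrame with column A replaced by its square
result = df.assign(A=df['A'] ** 2)

print(result)
print(df)  # the original DataFrame still holds the unsquared values
```

This keeps the transformation free of side effects, which is the main concern raised at the start of this example.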

Conclusion

The apply() function in Pandas is a flexible tool for applying functions to Series or DataFrame objects, by column, by row, or by value. It covers a wide range of data transformation tasks, including functions with extra arguments, conditional logic, and multiple return values. Because apply() calls a Python function once per row or column, a built-in or vectorized operation is usually faster when one exists; apply() is most valuable when the logic cannot be expressed that way.