Pandas Append Two DataFrames

Pandas is a Python library for data manipulation and analysis. A common task when working with data is combining datasets, which can be done by concatenating, merging, or appending. This article focuses on append(), which combines two DataFrame objects by adding the rows of one to the other. Note that the public DataFrame.append() method was deprecated in pandas 1.4 and removed in pandas 2.0. The examples in this article therefore call the private _append() method, and pd.concat() is the supported public way to get the same result.

Introduction to DataFrame Append

The append() function concatenates along axis=0, that is, along the index (row-wise). It returns a new DataFrame in which the rows of the second DataFrame are stacked below the rows of the first. It does not modify either of the original DataFrames.
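
To illustrate the non-mutating behavior described above, here is a minimal sketch. It uses pd.concat(), the public function that performs the same row-wise stacking; the names `left`, `right`, and `stacked` are illustrative only and do not come from the examples in this article.

```python
import pandas as pd

left = pd.DataFrame({'A': ['A0', 'A1']})
right = pd.DataFrame({'A': ['A2', 'A3']})

# Stack the rows of `right` below the rows of `left` (axis=0).
stacked = pd.concat([left, right])

# The inputs are untouched; only the returned object holds all four rows.
print(len(left), len(right), len(stacked))

# Each input keeps its own index labels unless ignore_index=True is passed.
print(stacked.index.tolist())
```

Because both inputs use the default index here, the stacked result carries repeated labels, which is the situation ignore_index is meant to handle.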

Basic Syntax of append()

The basic syntax of the append() function is as follows:

DataFrame.append(other, ignore_index=False, verify_integrity=False, sort=False)
  • other: The DataFrame, Series, or dict-like object to append, or a list of these.
  • ignore_index: If True, the original index labels are discarded and the rows of the result are labeled 0, 1, …, n-1.
  • verify_integrity: If True, raises a ValueError if appending would create duplicate index values.
  • sort: If True, sorts the columns when the columns of self and other are not aligned.
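
As a rough sketch of how ignore_index and verify_integrity interact, the snippet below builds two small frames whose index labels overlap. It uses pd.concat(), which accepts parameters with the same names; the frame names and the `overlap_detected` flag are hypothetical and only serve the illustration.

```python
import pandas as pd

first = pd.DataFrame({'A': ['A0', 'A1']}, index=[0, 1])
second = pd.DataFrame({'A': ['A2', 'A3']}, index=[1, 2])  # label 1 overlaps

# ignore_index=True discards the labels, so the overlap does not matter.
renumbered = pd.concat([first, second], ignore_index=True)
print(renumbered.index.tolist())

# verify_integrity=True refuses to build an index with duplicate labels.
try:
    pd.concat([first, second], verify_integrity=True)
    overlap_detected = False
except ValueError as exc:
    overlap_detected = True
    print("ValueError:", exc)
```

The integrity check costs extra time on large frames, which is one reason it is off by default.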

Examples of Appending DataFrames

Let’s explore several examples to understand how to use the append() function effectively. Each example will include complete, standalone code that can be run independently.

Example 1: Basic DataFrame Append

import pandas as pd

# Create two DataFrames
df1 = pd.DataFrame({
    'A': ['A0', 'A1', 'A2'],
    'B': ['B0', 'B1', 'B2']
}, index=[0, 1, 2])

df2 = pd.DataFrame({
    'A': ['A3', 'A4', 'A5'],
    'B': ['B3', 'B4', 'B5']
}, index=[3, 4, 5])

# Append df2 to df1 (_append() is the private method behind the removed public append())
result = df1._append(df2)
print(result)

Output:

    A   B
0  A0  B0
1  A1  B1
2  A2  B2
3  A3  B3
4  A4  B4
5  A5  B5

Example 2: Append with Ignore Index

import pandas as pd

# Create two DataFrames
df1 = pd.DataFrame({
    'A': ['A0', 'A1', 'A2'],
    'B': ['B0', 'B1', 'B2']
}, index=[0, 1, 2])

df2 = pd.DataFrame({
    'A': ['A3', 'A4', 'A5'],
    'B': ['B3', 'B4', 'B5']
}, index=[3, 4, 5])

# Append df2 to df1 and ignore the index
result = df1._append(df2, ignore_index=True)
print(result)

Output:

    A   B
0  A0  B0
1  A1  B1
2  A2  B2
3  A3  B3
4  A4  B4
5  A5  B5

Example 3: Append with Column Mismatch

import pandas as pd

# Create two DataFrames with different columns
df1 = pd.DataFrame({
    'A': ['A0', 'A1', 'A2'],
    'B': ['B0', 'B1', 'B2']
}, index=[0, 1, 2])

df2 = pd.DataFrame({
    'C': ['C3', 'C4', 'C5'],
    'D': ['D3', 'D4', 'D5']
}, index=[3, 4, 5])

# Append df2 to df1
result = df1._append(df2, sort=True)
print(result)

Output:

     A    B    C    D
0   A0   B0  NaN  NaN
1   A1   B1  NaN  NaN
2   A2   B2  NaN  NaN
3  NaN  NaN   C3   D3
4  NaN  NaN   C4   D4
5  NaN  NaN   C5   D5
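
The column-mismatch case can also be sketched with the public pd.concat() function. The example below assumes two small frames that share only column A. Note that the join parameter belongs to pd.concat() and has no counterpart in append(); it is shown here only to contrast keeping all columns with keeping just the shared ones.

```python
import pandas as pd

left = pd.DataFrame({'B': ['B0', 'B1'], 'A': ['A0', 'A1']})
right = pd.DataFrame({'C': ['C2', 'C3'], 'A': ['A2', 'A3']})

# Default join='outer': keep the union of columns and fill the gaps with NaN.
# sort=True orders the resulting columns alphabetically.
union = pd.concat([left, right], ignore_index=True, sort=True)
print(union.columns.tolist())
print(union.isna().sum().to_dict())  # missing values per column

# join='inner' keeps only the columns present in every frame.
shared = pd.concat([left, right], ignore_index=True, join='inner')
print(shared.columns.tolist())
```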

Example 4: Append Using a Dict

import pandas as pd

# Create a DataFrame
df1 = pd.DataFrame({
    'A': ['A0', 'A1', 'A2'],
    'B': ['B0', 'B1', 'B2']
}, index=[0, 1, 2])

# Append using a dictionary
result = df1._append({'A': 'A3', 'B': 'B3'}, ignore_index=True)
print(result)

Output:

    A   B
0  A0  B0
1  A1  B1
2  A2  B2
3  A3  B3
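
On pandas 2.0 and later, one way to add a dict as a row with the public API is to wrap it in a one-row DataFrame and pass it to pd.concat(). This is a minimal sketch; the names `new_row` and `row_frame` are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({'A': ['A0', 'A1'], 'B': ['B0', 'B1']})
new_row = {'A': 'A2', 'B': 'B2'}

# pd.concat() accepts Series/DataFrame objects, not plain dicts, so the
# dict is wrapped in a list to build a one-row DataFrame first.
row_frame = pd.DataFrame([new_row])

# ignore_index=True renumbers the rows so the new row gets the next label.
result = pd.concat([df, row_frame], ignore_index=True)
print(result)
```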

Example 5: Append Multiple DataFrames

import pandas as pd

# Create three DataFrames
df1 = pd.DataFrame({
    'A': ['A0', 'A1', 'A2'],
    'B': ['B0', 'B1', 'B2']
}, index=[0, 1, 2])

df2 = pd.DataFrame({
    'A': ['A3', 'A4', 'A5'],
    'B': ['B3', 'B4', 'B5']
}, index=[3, 4, 5])

df3 = pd.DataFrame({
    'A': ['A6', 'A7', 'A8'],
    'B': ['B6', 'B7', 'B8']
}, index=[6, 7, 8])

# Append df2 and df3 to df1
result = df1._append([df2, df3])
print(result)

Output:

    A   B
0  A0  B0
1  A1  B1
2  A2  B2
3  A3  B3
4  A4  B4
5  A5  B5
6  A6  B6
7  A7  B7
8  A8  B8
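
When many pieces have to be combined, a common pattern is to collect them in a list and concatenate once, since every append or concat call copies the data into a new object. The sketch below assumes a hypothetical list named `batches` standing in for data that arrives piece by piece.

```python
import pandas as pd

# Hypothetical batches standing in for data that arrives piece by piece.
batches = [
    pd.DataFrame({'A': [f'A{i}'], 'B': [f'B{i}']})
    for i in range(5)
]

# Collect the pieces first, then concatenate once. Appending inside a loop
# would copy all previously accumulated rows on every iteration.
result = pd.concat(batches, ignore_index=True)
print(result.shape)
print(result['A'].tolist())
```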

Conclusion

In this article, we explored how to use append() in Pandas to combine two or more DataFrames by stacking them vertically. We covered ignoring the index, handling mismatched columns, appending a dictionary, and appending several DataFrames at once. Each example is self-contained and can be run independently.

Because the public append() method is no longer available from pandas 2.0 onward, pd.concat() is the supported way to perform the same row-wise stacking in current code.