Pandas Append Row to DataFrame
Pandas is a powerful Python library for data manipulation and analysis. One of the common tasks when working with data is adding rows to a DataFrame. In this article, we will explore various methods to append rows to a DataFrame in Pandas. We will cover different scenarios and provide a code example for each method.
Introduction
Appending rows to a DataFrame is a fundamental operation in data manipulation. This operation is often required when collecting data from different sources, aggregating data, or simply updating existing data structures. Pandas provides multiple ways to append rows, each with its advantages and considerations.
Creating a DataFrame
Before diving into the methods of appending rows, let’s start by creating a simple DataFrame that we will use in our examples.
import pandas as pd
# Creating a simple DataFrame
df = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Age': [25, 30],
    'City': ['New York', 'Los Angeles']
})
print("Initial DataFrame:")
print(df)
In this example, we create a DataFrame with three columns: Name, Age, and City. The DataFrame contains two rows with the respective data.
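Using append() Method
The append() method was long the most direct way to add rows to a DataFrame: it accepted another DataFrame, a dictionary, or a list of dictionaries. However, DataFrame.append() was deprecated in pandas 1.4 and removed in pandas 2.0. The examples below call _append(), the private method that remains in pandas 2.x. Because it is not part of the public API, it may change without notice, and pd.concat() (covered later in this article) is the supported replacement.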
Example 1: Appending a Row from a Dictionary
import pandas as pd
# Creating the initial DataFrame
df = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Age': [25, 30],
    'City': ['New York', 'Los Angeles']
})
# Creating a dictionary for the new row
new_row = {'Name': 'Charlie', 'Age': 35, 'City': 'Chicago'}
# Appending the new row
# (_append() is a private pandas method; the public append() was removed in pandas 2.0)
df = df._append(new_row, ignore_index=True)
print("DataFrame after appending a new row:")
print(df)
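In this example, we create a dictionary representing the new row and pass it to _append(), the private counterpart of the removed append() method, to add it to the DataFrame. The ignore_index=True parameter discards the existing index labels and renumbers the rows from 0. It is required when appending a dictionary, which has no index label of its own.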
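For readers on pandas 2.0 or later who prefer to stay on the public API, the same row can be added with pd.concat(). The following is a minimal sketch under that assumption, not the call used in the example above; the variable name result is chosen here for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Age': [25, 30],
    'City': ['New York', 'Los Angeles']
})

new_row = {'Name': 'Charlie', 'Age': 35, 'City': 'Chicago'}

# Wrap the dictionary in a list so it becomes a one-row DataFrame,
# then concatenate it below the existing rows.
result = pd.concat([df, pd.DataFrame([new_row])], ignore_index=True)
print(result)
```

Wrapping the dictionary in a list matters: each dictionary in the list becomes one row of the new DataFrame.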
Example 2: Appending Rows from Another DataFrame
import pandas as pd
# Creating the initial DataFrame
df1 = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Age': [25, 30],
    'City': ['New York', 'Los Angeles']
})
# Creating another DataFrame to append
df2 = pd.DataFrame({
    'Name': ['Charlie', 'David'],
    'Age': [35, 40],
    'City': ['Chicago', 'Houston']
})
# Appending df2 to df1
# (_append() is a private pandas method; the public append() was removed in pandas 2.0)
df1 = df1._append(df2, ignore_index=True)
print("DataFrame after appending another DataFrame:")
print(df1)
Here, we create a second DataFrame df2 and append it to the first DataFrame df1 using _append().
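To illustrate what ignore_index=True changes, the sketch below concatenates two small DataFrames with and without it. It uses pd.concat() rather than the call in the example above, and the variable names kept and renumbered are illustrative.

```python
import pandas as pd

df1 = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Age': [25, 30]})
df2 = pd.DataFrame({'Name': ['Charlie', 'David'], 'Age': [35, 40]})

# Default behavior: each DataFrame keeps its own index labels.
kept = pd.concat([df1, df2])

# ignore_index=True: the labels are discarded and rows are renumbered from 0.
renumbered = pd.concat([df1, df2], ignore_index=True)

print(kept.index.tolist())
print(renumbered.index.tolist())
```

Without ignore_index=True the labels 0 and 1 each appear twice, so a lookup such as kept.loc[0] returns two rows.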
Using the loc[] Indexer
The loc[] indexer accesses a group of rows and columns by labels or a boolean array. Assigning to a row label that does not exist yet adds a new row with that label, so loc[] can also be used to append.
Example 3: Appending a Row with loc[]
import pandas as pd
# Creating the initial DataFrame
df = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Age': [25, 30],
    'City': ['New York', 'Los Angeles']
})
# Appending a new row using loc
df.loc[2] = ['Charlie', 35, 'Chicago']
print("DataFrame after appending a new row using loc:")
print(df)
In this example, we use loc[] to append a new row by assigning the row data to a new index label. The label 2 does not exist yet, so it becomes the new row’s index. Assigning to a label that already exists would overwrite that row instead.
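The difference between adding and overwriting with loc[] can be sketched as follows. The replacement values for Bob's row are made up for illustration, and the sketch assumes the default 0-based index.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Age': [25, 30],
    'City': ['New York', 'Los Angeles']
})

# Label 1 already exists, so this replaces Bob's row instead of adding one.
df.loc[1] = ['Bobby', 31, 'Boston']

# len(df) is 2, which is not yet a label in the default index,
# so this assignment adds a new row.
df.loc[len(df)] = ['Charlie', 35, 'Chicago']
print(df)
```

Using len(df) as the new label avoids hard-coding the next index value, but only while the index is the default 0, 1, 2, ... sequence.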
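Using the iloc[] Indexer
The iloc[] indexer selects and assigns by integer position. Unlike loc[], it cannot be used to append rows: assigning to a position beyond the last row raises an IndexError, because iloc[] cannot enlarge the DataFrame. It can only overwrite rows that already exist.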
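As a quick check of how iloc[] behaves at the end of a DataFrame, the sketch below tries to assign one position past the last row. In current pandas versions this raises an IndexError rather than adding a row; the outcome variable is only there to record what happened.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Age': [25, 30],
    'City': ['New York', 'Los Angeles']
})

try:
    # Position 2 is one past the last row (positions 0 and 1).
    df.iloc[len(df)] = ['Charlie', 35, 'Chicago']
    outcome = 'row added'
except IndexError as exc:
    outcome = 'IndexError'
    print(f"iloc[] could not add the row: {exc}")

print(len(df))
```

The DataFrame is left unchanged, so a label-based assignment with loc[] or a call to concat() is needed to actually add the row.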
Using concat() Function
The concat() function in Pandas concatenates pandas objects along a particular axis. With the default axis=0 it stacks DataFrames vertically, which makes it the standard way to append rows to a DataFrame.
Example 4: Appending a Row with concat()
import pandas as pd
# Creating the initial DataFrame
df1 = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Age': [25, 30],
    'City': ['New York', 'Los Angeles']
})
# Creating another DataFrame to append
df2 = pd.DataFrame([{
    'Name': 'Charlie',
    'Age': 35,
    'City': 'Chicago'
}])
# Using concat to append the new row
df = pd.concat([df1, df2], ignore_index=True)
print("DataFrame after appending a new row using concat:")
print(df)
In this example, we create a second DataFrame df2 with a single row and use the concat() function to append it to df1.
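One behavior worth knowing when appending with concat() is how it handles columns that do not match. In the sketch below, the new row deliberately omits City (an assumption made for illustration); concat() keeps the union of the columns and fills the gap with NaN.

```python
import pandas as pd

df1 = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Age': [25, 30],
    'City': ['New York', 'Los Angeles']
})

# This one-row DataFrame has no 'City' column.
df2 = pd.DataFrame([{'Name': 'Charlie', 'Age': 35}])

result = pd.concat([df1, df2], ignore_index=True)
print(result)
```

Checking result.isna() after such an append is a simple way to spot rows that arrived with missing columns.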
Using assign() Method
The assign() method adds new columns to a DataFrame. It is not designed for appending rows and cannot add one by itself, but it can be chained with an append step.
Example 5: Appending a Row with assign()
import pandas as pd
# Creating the initial DataFrame
df = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Age': [25, 30],
    'City': ['New York', 'Los Angeles']
})
# Creating a DataFrame for the new row
new_row = pd.DataFrame({
    'Name': ['Charlie'],
    'Age': [35],
    'City': ['Chicago']
})
# Appending the new row using assign
# (_append() is a private pandas method; the public append() was removed in pandas 2.0)
df = (
    df.assign(dummy=1)
    ._append(new_row.assign(dummy=1), ignore_index=True)
    .drop(columns='dummy')
)
print("DataFrame after appending a new row using assign:")
print(df)
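Here, we use assign() to add a temporary column to both DataFrames, append the new row with _append(), and then drop the temporary column. Note that the append step does the actual work: the temporary column does not affect the result, so the same DataFrame is produced without it.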
Using a List of Dictionaries
A list of dictionaries can be used to append multiple rows to a DataFrame.
Example 6: Appending Rows with a List of Dictionaries
import pandas as pd
# Creating the initial DataFrame
df = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Age': [25, 30],
    'City': ['New York', 'Los Angeles']
})
# List of dictionaries representing new rows
new_rows = [
    {'Name': 'Charlie', 'Age': 35, 'City': 'Chicago'},
    {'Name': 'David', 'Age': 40, 'City': 'Houston'}
]
# Appending new rows
# (_append() is a private pandas method; the public append() was removed in pandas 2.0)
df = df._append(new_rows, ignore_index=True)
print("DataFrame after appending new rows using a list of dictionaries:")
print(df)
In this example, we use a list of dictionaries to represent new rows and append them to the DataFrame using _append(). Each dictionary becomes one row, with its keys matched to the column names.
Using a List of Lists
A list of lists can also be used to append multiple rows. Because the inner lists carry no column names, each one must list its values in the same order as the DataFrame's columns.
Example 7: Appending Rows with a List of Lists
import pandas as pd
# Creating the initial DataFrame
df = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Age': [25, 30],
    'City': ['New York', 'Los Angeles']
})
# List of lists representing new rows
new_rows = [
    ['Charlie', 35, 'Chicago'],
    ['David', 40, 'Houston']
]
# Appending new rows using a loop
for row in new_rows:
    # len(df) is the next unused label in the default 0-based index
    df.loc[len(df)] = row
print("DataFrame after appending new rows using a list of lists:")
print(df)
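Here, we use a loop to iterate through the list of lists and append each row using loc[]. Because len(df) is one past the last label of a default 0-based index, each assignment creates a new row. With a non-default index, len(df) might match an existing label and overwrite that row.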
Appending Rows in a Loop
Appending rows in a loop is a common practice when the data to be appended is generated or processed in iterations.
Example 8: Appending Rows in a Loop
import pandas as pd
# Creating the initial DataFrame
df = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Age': [25, 30],
    'City': ['New York', 'Los Angeles']
})
# Data to append in a loop
names = ['Charlie', 'David']
ages = [35, 40]
cities = ['Chicago', 'Houston']
# Appending rows in a loop
for name, age, city in zip(names, ages, cities):
    df.loc[len(df)] = [name, age, city]
print("DataFrame after appending new rows in a loop:")
print(df)
In this example, we zip the three lists together and append one row per iteration with loc[].
Performance Considerations
While appending rows is a common operation, it’s important to consider performance, especially with large datasets. Each append typically copies the existing data into a new object, so building a DataFrame one row at a time does more and more work as it grows. Instead, consider collecting rows in a list and creating a DataFrame at the end.
Example 9: Efficient Appending
import pandas as pd
# Initial DataFrame
df = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Age': [25, 30],
    'City': ['New York', 'Los Angeles']
})
# List to collect new rows
rows_to_append = [
    ['Charlie', 35, 'Chicago'],
    ['David', 40, 'Houston']
]
# Create a new DataFrame from the list of rows
new_df = pd.DataFrame(rows_to_append, columns=['Name', 'Age', 'City'])
# Append the new DataFrame to the existing one
df = pd.concat([df, new_df], ignore_index=True)
print("DataFrame after efficiently appending new rows:")
print(df)
In this example, we collect rows in a list, convert the list to a DataFrame, and then use concat() to append the new DataFrame. This copies the existing data once rather than once per row, which is more efficient.
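When rows are produced one at a time, the same idea can be applied inside a loop. The sketch below is one possible arrangement, with collected as an illustrative name: it gathers each row as a dictionary in a plain Python list and calls concat() once at the end.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob'],
    'Age': [25, 30],
    'City': ['New York', 'Los Angeles']
})

names = ['Charlie', 'David']
ages = [35, 40]
cities = ['Chicago', 'Houston']

# Accumulate plain dictionaries; appending to a Python list is cheap.
collected = []
for name, age, city in zip(names, ages, cities):
    collected.append({'Name': name, 'Age': age, 'City': city})

# Build one DataFrame from the collected rows and concatenate once.
df = pd.concat([df, pd.DataFrame(collected)], ignore_index=True)
print(df)
```

Using dictionaries rather than lists keeps each value tied to its column name, so the order of the keys does not matter.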
Pandas Append Row to DataFrame Conclusion
Appending rows to a DataFrame in Pandas can be done in multiple ways, each suited to different scenarios. Whether using concat(), loc[], or the older append() (removed in pandas 2.0), understanding the options available allows for more efficient and effective data manipulation. Note that iloc[] can only overwrite existing rows, not add new ones. Consider performance implications and choose the method that best fits the task at hand.