Pandas Concat

Pandas is a powerful data manipulation library in Python, widely used in data analysis and data science. One of the essential functions provided by pandas is concat, which concatenates pandas objects along a particular axis with optional set logic along the other axes. This function can concatenate Series and DataFrame objects; the Panel type accepted by older versions has since been removed from pandas.

This article will explore the concat function in detail, providing a comprehensive guide on its usage with various examples. Each example will be standalone, ensuring that you can run them independently without any dependencies on previous code snippets.

Understanding Pandas Concat Function

The concat function in pandas is primarily used to combine data from different DataFrame or Series objects into a single DataFrame or Series. The syntax for the concat function is:

pandas.concat(objs, axis=0, join='outer', ignore_index=False, keys=None, levels=None, names=None, verify_integrity=False, sort=False, copy=True)
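  • objs: A sequence or mapping of Series or DataFrame objects. If a mapping is passed, its keys are used as the keys argument unless keys is given explicitly.
  • axis: The axis to concatenate along: 0 (or 'index') stacks objects vertically, 1 (or 'columns') places them side by side.
  • join: How to handle labels on the other axis: 'outer' (the default) keeps the union of the labels, 'inner' keeps only their intersection.
  • ignore_index: If True, do not use the index values on the concatenation axis; the result is labeled 0, 1, ..., n - 1 instead.
  • keys: Construct a hierarchical index using the passed keys as the outermost level.
  • levels: Specific levels (unique values) to use for constructing a MultiIndex. If omitted, they are inferred from keys.
  • names: Names for the levels in the resulting hierarchical index.
  • verify_integrity: Check whether the new concatenated axis contains duplicates, and raise a ValueError if it does.
  • sort: Sort the non-concatenation axis if it is not already aligned.
  • copy: If False, do not copy data unnecessarily. The default and the effect of this parameter vary between pandas versions, so check the documentation for the version you use.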
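
The examples that follow do not exercise the mapping form of objs, the names parameter, or verify_integrity, so here is a minimal sketch of those three. The frame names left and right and the level names 'source' and 'row' are placeholders chosen for illustration.

```python
import pandas as pd

# Two small frames that both use the default index labels 0 and 1.
left = pd.DataFrame({'A': ['A0', 'A1']})
right = pd.DataFrame({'A': ['A2', 'A3']})

# Passing a mapping: the dict keys become the outer index level,
# and names labels the two levels of the resulting MultiIndex.
labeled = pd.concat({'left': left, 'right': right}, names=['source', 'row'])
print(labeled)
print(list(labeled.index.names))

# verify_integrity=True raises ValueError here because the
# index labels 0 and 1 appear in both frames.
duplicate_error = None
try:
    pd.concat([left, right], verify_integrity=True)
except ValueError as exc:
    duplicate_error = exc
print(type(duplicate_error).__name__)
```

Combining verify_integrity=True with ignore_index=True would not raise, since the result then receives fresh labels.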

Examples of Using Pandas Concat

Example 1: Basic Concatenation of Two DataFrames

import pandas as pd

df1 = pd.DataFrame({
    'A': ['A0', 'A1', 'A2', 'A3'],
    'B': ['B0', 'B1', 'B2', 'B3'],
    'C': ['C0', 'C1', 'C2', 'C3'],
    'D': ['D0', 'D1', 'D2', 'D3']
}, index=[0, 1, 2, 3])

df2 = pd.DataFrame({
    'A': ['A4', 'A5', 'A6', 'A7'],
    'B': ['B4', 'B5', 'B6', 'B7'],
    'C': ['C4', 'C5', 'C6', 'C7'],
    'D': ['D4', 'D5', 'D6', 'D7']
}, index=[4, 5, 6, 7])

result = pd.concat([df1, df2])
print(result)

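Output: the rows of df2 are appended below those of df1, giving eight rows labeled 0 through 7 with columns A, B, C and D.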

Example 2: Concatenation with Axis Set to 1

import pandas as pd

df1 = pd.DataFrame({
    'A': ['A0', 'A1', 'A2', 'A3'],
    'B': ['B0', 'B1', 'B2', 'B3']
})

df2 = pd.DataFrame({
    'C': ['C0', 'C1', 'C2', 'C3'],
    'D': ['D0', 'D1', 'D2', 'D3']
})

result = pd.concat([df1, df2], axis=1)
print(result)

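Output: the two frames are placed side by side and aligned on their shared index 0 through 3, giving four rows with columns A, B, C and D.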

Example 3: Ignoring the Index

import pandas as pd

df1 = pd.DataFrame({
    'A': ['A0', 'A1', 'A2', 'A3'],
    'B': ['B0', 'B1', 'B2', 'B3']
})

df2 = pd.DataFrame({
    'A': ['A4', 'A5', 'A6', 'A7'],
    'B': ['B4', 'B5', 'B6', 'B7']
})

result = pd.concat([df1, df2], ignore_index=True)
print(result)

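Output: eight rows with columns A and B. The original labels (0 through 3 in each frame) are discarded and replaced by a fresh index 0 through 7.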

Example 4: Adding MultiIndex Keys

import pandas as pd

df1 = pd.DataFrame({
    'A': ['A0', 'A1', 'A2', 'A3'],
    'B': ['B0', 'B1', 'B2', 'B3']
})

df2 = pd.DataFrame({
    'A': ['A4', 'A5', 'A6', 'A7'],
    'B': ['B4', 'B5', 'B6', 'B7']
})

result = pd.concat([df1, df2], keys=['df1', 'df2'])
print(result)

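Output: eight rows with a two-level MultiIndex. The outer level is 'df1' or 'df2', identifying the source frame, and the inner level keeps each frame's original labels 0 through 3.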

Example 5: Concatenation with Different Columns

import pandas as pd

df1 = pd.DataFrame({
    'A': ['A0', 'A1', 'A2', 'A3'],
    'B': ['B0', 'B1', 'B2', 'B3']
})

df2 = pd.DataFrame({
    'B': ['B4', 'B5', 'B6', 'B7'],
    'C': ['C4', 'C5', 'C6', 'C7']
})

result = pd.concat([df1, df2], sort=False)
print(result)

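Output: eight rows with columns A, B and C, the union of both column sets. The rows from df1 contain NaN in column C and the rows from df2 contain NaN in column A.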

Example 6: Using join Parameter

import pandas as pd

df1 = pd.DataFrame({
    'A': ['A0', 'A1', 'A2', 'A3'],
    'B': ['B0', 'B1', 'B2', 'B3']
})

df2 = pd.DataFrame({
    'B': ['B4', 'B5', 'B6', 'B7'],
    'C': ['C4', 'C5', 'C6', 'C7']
})

result = pd.concat([df1, df2], join='inner')
print(result)

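Output: eight rows containing only column B, the single column shared by both frames. Columns A and C are dropped by the inner join.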

Example 7: Concatenation with Different Indexes

import pandas as pd

df1 = pd.DataFrame({
    'A': ['A0', 'A1', 'A2', 'A3'],
    'B': ['B0', 'B1', 'B2', 'B3']
}, index=[0, 1, 2, 3])

df2 = pd.DataFrame({
    'A': ['A4', 'A5', 'A6', 'A7'],
    'B': ['B4', 'B5', 'B6', 'B7']
}, index=[4, 5, 6, 7])

result = pd.concat([df1, df2])
print(result)

Output: eight rows with columns A and B. Each row keeps its original label, so the index runs from 0 through 7.

Example 8: Concatenation with Non-Overlapping Indexes

import pandas as pd

df1 = pd.DataFrame({
    'A': ['A0', 'A1', 'A2', 'A3'],
    'B': ['B0', 'B1', 'B2', 'B3']
}, index=[0, 1, 2, 3])

df2 = pd.DataFrame({
    'A': ['A4', 'A5', 'A6', 'A7'],
    'B': ['B4', 'B5', 'B6', 'B7']
}, index=[8, 9, 10, 11])

result = pd.concat([df1, df2])
print(result)

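Output: eight rows with columns A and B. The original labels are kept, so the index reads 0, 1, 2, 3, 8, 9, 10, 11 with the gap preserved.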

Example 9: Concatenation with Hierarchical Index

import pandas as pd

df1 = pd.DataFrame({
    'A': ['A0', 'A1', 'A2', 'A3'],
    'B': ['B0', 'B1', 'B2', 'B3']
}, index=[0, 1, 2, 3])

df2 = pd.DataFrame({
    'A': ['A4', 'A5', 'A6', 'A7'],
    'B': ['B4', 'B5', 'B6', 'B7']
}, index=[4, 5, 6, 7])

result = pd.concat([df1, df2], keys=['Group1', 'Group2'])
print(result)

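Output: eight rows with a two-level MultiIndex. Rows under 'Group1' keep the inner labels 0 through 3 and rows under 'Group2' keep 4 through 7.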
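
One practical use of the outer level created by keys is selecting the rows that came from a particular source. The following is a minimal sketch with two-row frames rather than the four-row ones above; the variable name group2 is introduced here for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['A0', 'A1']}, index=[0, 1])
df2 = pd.DataFrame({'A': ['A4', 'A5']}, index=[4, 5])

result = pd.concat([df1, df2], keys=['Group1', 'Group2'])

# Selecting on the outer level returns only the rows that came from df2,
# indexed by their original inner labels.
group2 = result.loc['Group2']
print(group2)
```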

Example 10: Concatenation with Mixed DataTypes

import pandas as pd

df1 = pd.DataFrame({
    'A': [1, 2, 3, 4],
    'B': ['B0', 'B1', 'B2', 'B3']
})

df2 = pd.DataFrame({
    'A': [5, 6, 7, 8],
    'B': ['B4', 'B5', 'B6', 'B7']
})

result = pd.concat([df1, df2])
print(result)

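Output: eight rows with columns A and B. Column A keeps its integer dtype because both frames hold integers there, and column B holds the strings. The index repeats 0 through 3 twice.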
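
Dtypes are only preserved when every input supplies values for a column. As a small sketch of the contrast (the frame names ints and no_a are illustrative), an integer column is converted to floating point once concat has to fill it with NaN:

```python
import pandas as pd

ints = pd.DataFrame({'A': [1, 2], 'B': ['B0', 'B1']})
no_a = pd.DataFrame({'B': ['B2', 'B3']})  # has no column A

# Both inputs provide integers for A, so the integer dtype survives.
same_cols = pd.concat([ints, ints], ignore_index=True)

# The second input lacks A, so its rows are filled with NaN
# and the column becomes floating point.
missing_col = pd.concat([ints, no_a], ignore_index=True)

print(same_cols['A'].dtype)
print(missing_col['A'].dtype)
```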

Example 11: Concatenation and Retaining the Original Index

import pandas as pd

df1 = pd.DataFrame({
    'A': ['A0', 'A1', 'A2', 'A3'],
    'B': ['B0', 'B1', 'B2', 'B3']
})

df2 = pd.DataFrame({
    'A': ['A4', 'A5', 'A6', 'A7'],
    'B': ['B4', 'B5', 'B6', 'B7']
})

result = pd.concat([df1, df2], ignore_index=False)
print(result)

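Output: eight rows with columns A and B. Because ignore_index=False is the default, the original labels are kept and the index repeats 0 through 3 twice.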

Example 12: Concatenation with DataFrames Having Different Shapes

import pandas as pd

df1 = pd.DataFrame({
    'A': ['A0', 'A1', 'A2'],
    'B': ['B0', 'B1', 'B2']
})

df2 = pd.DataFrame({
    'A': ['A3', 'A4', 'A5', 'A6'],
    'B': ['B3', 'B4', 'B5', 'B6'],
    'C': ['C3', 'C4', 'C5', 'C6']
})

result = pd.concat([df1, df2], sort=False)
print(result)

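Output: seven rows with columns A, B and C. The three rows from df1 contain NaN in column C, and the index reads 0, 1, 2, 0, 1, 2, 3.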

Example 13: Concatenation Using sort Parameter

import pandas as pd

df1 = pd.DataFrame({
    'A': ['A0', 'A1', 'A2', 'A3'],
    'C': ['C0', 'C1', 'C2', 'C3']
})

df2 = pd.DataFrame({
    'B': ['B4', 'B5', 'B6', 'B7'],
    'C': ['C4', 'C5', 'C6', 'C7']
})

result = pd.concat([df1, df2], sort=True)
print(result)

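Output: eight rows with the columns sorted as A, B, C. The rows from df1 contain NaN in column B and the rows from df2 contain NaN in column A. With sort=False the columns would appear in order of first appearance: A, C, B.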

Example 14: Concatenation with copy Parameter Set to False

import pandas as pd

df1 = pd.DataFrame({
    'A': ['A0', 'A1', 'A2', 'A3'],
    'B': ['B0', 'B1', 'B2', 'B3']
})

df2 = pd.DataFrame({
    'A': ['A4', 'A5', 'A6', 'A7'],
    'B': ['B4', 'B5', 'B6', 'B7']
})

result = pd.concat([df1, df2], copy=False)
print(result)

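Output: the same eight rows that a plain pd.concat([df1, df2]) produces. The copy parameter only influences whether the underlying data is copied, not the values in the result. Newer pandas versions may ignore this argument or warn that it is deprecated.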

Example 15: Concatenation with Multiple DataFrames

import pandas as pd

df1 = pd.DataFrame({
    'A': ['A0', 'A1', 'A2', 'A3'],
    'B': ['B0', 'B1', 'B2', 'B3']
})

df2 = pd.DataFrame({
    'A': ['A4', 'A5', 'A6', 'A7'],
    'B': ['B4', 'B5', 'B6', 'B7']
})

df3 = pd.DataFrame({
    'A': ['A8', 'A9', 'A10', 'A11'],
    'B': ['B8', 'B9', 'B10', 'B11']
})

result = pd.concat([df1, df2, df3])
print(result)

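Output: twelve rows with columns A and B, with the frames stacked in the order they were passed. The index repeats 0 through 3 three times.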

Example 16: Concatenation with Different Column Orders

import pandas as pd

df1 = pd.DataFrame({
    'A': ['A0', 'A1', 'A2', 'A3'],
    'B': ['B0', 'B1', 'B2', 'B3']
})

df2 = pd.DataFrame({
    'B': ['B4', 'B5', 'B6', 'B7'],
    'A': ['A4', 'A5', 'A6', 'A7']
})

result = pd.concat([df1, df2])
print(result)

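Output: eight rows with columns A and B. Columns are aligned by name rather than by position, so the values from df2 land in the correct columns, and the result follows the column order of df1.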

Example 17: Concatenation with DataFrame and Series

import pandas as pd

df = pd.DataFrame({
    'A': ['A0', 'A1', 'A2', 'A3'],
    'B': ['B0', 'B1', 'B2', 'B3']
})

s = pd.Series(['S1', 'S2', 'S3', 'S4'], name='S')

result = pd.concat([df, s], axis=1)
print(result)

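Output: four rows with columns A, B and S. The Series is aligned on the shared index 0 through 3 and its name becomes the new column label.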
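
The column label S above comes from the name of the Series. As a brief sketch of what happens without one (the variable names unnamed and labeled are illustrative), an unnamed Series receives an integer column label, which can be avoided by naming the Series first:

```python
import pandas as pd

df = pd.DataFrame({
    'A': ['A0', 'A1'],
    'B': ['B0', 'B1']
})

# A Series without a name gets an integer column label when
# concatenated with a DataFrame along axis=1.
unnamed = pd.concat([df, pd.Series(['S1', 'S2'])], axis=1)
print(list(unnamed.columns))

# Naming the Series with rename gives a meaningful column label.
labeled = pd.concat([df, pd.Series(['S1', 'S2']).rename('S')], axis=1)
print(list(labeled.columns))
```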

Example 18: Concatenation with Handling of NaN Values

import pandas as pd

df1 = pd.DataFrame({
    'A': ['A0', 'A1', 'A2', 'A3'],
    'B': ['B0', 'B1', 'B2', 'B3']
})

df2 = pd.DataFrame({
    'C': ['C4', 'C5', 'C6', 'C7'],
    'D': ['D4', 'D5', 'D6', 'D7']
})

result = pd.concat([df1, df2], sort=False)
print(result)

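Output: eight rows with columns A, B, C and D. The frames share no columns, so the rows from df1 contain NaN in C and D and the rows from df2 contain NaN in A and B.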

Example 19: Concatenation with Custom Index Names

import pandas as pd

df1 = pd.DataFrame({
    'A': ['A0', 'A1', 'A2', 'A3'],
    'B': ['B0', 'B1', 'B2', 'B3']
}, index=['pandasdataframe.com1', 'pandasdataframe.com2', 'pandasdataframe.com3', 'pandasdataframe.com4'])

df2 = pd.DataFrame({
    'C': ['C4', 'C5', 'C6', 'C7'],
    'D': ['D4', 'D5', 'D6', 'D7']
}, index=['pandasdataframe.com5', 'pandasdataframe.com6', 'pandasdataframe.com7', 'pandasdataframe.com8'])

result = pd.concat([df1, df2], axis=1)
print(result)

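Output: eight rows with columns A, B, C and D. The two frames share no index labels, so the first four rows contain NaN in C and D and the last four contain NaN in A and B.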
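
When the indexes overlap only partly, the join parameter decides which rows survive a column-wise concatenation. The following is a minimal sketch using hypothetical row labels r1 through r4:

```python
import pandas as pd

left = pd.DataFrame({'A': ['A0', 'A1', 'A2']}, index=['r1', 'r2', 'r3'])
right = pd.DataFrame({'C': ['C0', 'C1', 'C2']}, index=['r2', 'r3', 'r4'])

# Default outer join: every label from either frame appears,
# with NaN where a frame has no row for that label.
outer = pd.concat([left, right], axis=1)

# Inner join: only the labels present in both frames are kept.
inner = pd.concat([left, right], axis=1, join='inner')

print(outer)
print(inner)
```

Here the inner result keeps only the rows labeled r2 and r3, while the outer result has four rows and two NaN cells.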

These examples cover a wide range of scenarios where the Pandas concat function can be used effectively to manipulate and combine data in pandas. Each example is designed to be self-contained and executable, providing a practical understanding of how to use concat in different contexts.