Pandas DataFrame loc and iloc

Pandas is a powerful data manipulation library for Python that offers many ways to slice, filter, and reshape data. Two of its most useful indexers are loc and iloc. They are attributes used with square brackets, not functions that you call. loc is label-based: you select rows and columns by their index and column labels, or with a boolean array. iloc is position-based: you select rows and columns by their integer position, counting from 0, regardless of what the labels are.
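
The difference between labels and positions is easiest to see when the index labels are themselves integers that do not match the row positions. The following is a minimal sketch; the index values 10, 20, 30 are made up for illustration.

```python
import pandas as pd

# Hypothetical data: the index labels are integers that differ from the positions.
df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]},
                  index=[10, 20, 30])

by_label = df.loc[10]      # the row whose index label is 10
by_position = df.iloc[0]   # the first row, whatever its label is
print(by_label.equals(by_position))  # True: both refer to Alice's row

# No row is labelled 0, so a label lookup for 0 fails with KeyError.
try:
    df.loc[0]
    missing_label = False
except KeyError:
    missing_label = True
print(missing_label)
```

With the default index (0, 1, 2, ...) the labels and positions happen to coincide, which is why the two indexers can look interchangeable on small examples.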

This article will provide a comprehensive guide on how to use loc and iloc in pandas, including detailed examples to illustrate their usage.

Understanding loc in Pandas

The loc attribute allows indexing and slicing that always references the explicit index (the labels of the rows and columns). It can accept:

  • A single label
  • A list or array of labels
  • A slice object with labels (both the start and the stop label are included)
  • A boolean array
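
One detail worth illustrating is that a label slice with loc includes both endpoints, unlike ordinary Python slicing. The sketch below reuses the same small sample frame as the examples in this article; the variable names are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]},
                  index=['a', 'b', 'c'])

label_slice = df.loc['a':'b']    # includes both 'a' and 'b'
print(list(label_slice.index))   # ['a', 'b']

# For comparison, a plain Python list slice stops before the end position.
labels = ['a', 'b', 'c']
print(labels[0:1])               # ['a']
```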

Example 1: Selecting a single row by index label

import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]}
df = pd.DataFrame(data, index=['a', 'b', 'c'])
result = df.loc['a']
print(result)

Example 2: Selecting multiple rows by index label

import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]}
df = pd.DataFrame(data, index=['a', 'b', 'c'])
result = df.loc[['a', 'b']]
print(result)

Example 3: Selecting rows by a slice of index labels

import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]}
df = pd.DataFrame(data, index=['a', 'b', 'c'])
result = df.loc['a':'b']
print(result)

Example 4: Selecting rows and columns

import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]}
df = pd.DataFrame(data, index=['a', 'b', 'c'])
result = df.loc['a':'b', 'Name']
print(result)

Example 5: Using boolean arrays

import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]}
df = pd.DataFrame(data, index=['a', 'b', 'c'])
result = df.loc[df['Age'] > 25]
print(result)

Understanding iloc in Pandas

The iloc indexer is used for integer-location based indexing, that is, selection by position (counting from 0) rather than by label. It can accept:

  • An integer
  • A list or array of integers
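  • A slice object with integers (the stop position is excluded)
  • A boolean array (a plain array, not a boolean Series; see Example 10)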
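
Positional slicing with iloc follows the usual Python conventions, which the following sketch illustrates: the stop position is excluded, negative positions count from the end, and a slice that runs past the end is truncated. A single out-of-range position, by contrast, raises an error. The variable names are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]})

first_two = df.iloc[0:2]   # positions 0 and 1; position 2 is excluded
last_row = df.iloc[-1]     # negative positions count from the end
beyond = df.iloc[1:10]     # a slice past the end is truncated, not an error

print(len(first_two), last_row['Name'], len(beyond))

# A single out-of-range position is an error, unlike a slice.
try:
    df.iloc[10]
    out_of_bounds = False
except IndexError:
    out_of_bounds = True
print(out_of_bounds)
```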

Example 6: Selecting a single row by integer index

import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]}
df = pd.DataFrame(data)
result = df.iloc[0]
print(result)

Example 7: Selecting multiple rows by integer index

import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]}
df = pd.DataFrame(data)
result = df.iloc[[0, 1]]
print(result)

Example 8: Selecting rows by a slice of integer indices

import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]}
df = pd.DataFrame(data)
result = df.iloc[0:2]
print(result)

Example 9: Selecting rows and columns by integer index

import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]}
df = pd.DataFrame(data)
result = df.iloc[0:2, 1]
print(result)

Example 10: Using boolean arrays with iloc

import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]}
df = pd.DataFrame(data)
result = df.iloc[(df['Age'] > 25).values]
print(result)

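
The conversion to a plain array in Example 10 matters because iloc rejects a boolean Series, which carries index labels. The sketch below shows the failure and the fix, using to_numpy() as an alternative to .values. Which exception type is raised is an assumption here and may differ between pandas versions, so both likely types are caught.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]})
mask = df['Age'] > 25   # boolean Series, which carries the index labels

try:
    df.iloc[mask]
    series_accepted = True
except (ValueError, NotImplementedError):
    # Assumption: the exact exception type may vary between pandas versions.
    series_accepted = False

result = df.iloc[mask.to_numpy()]   # a plain NumPy boolean array is accepted
print(series_accepted)
print(list(result['Name']))
```

With loc the same mask can be passed directly, since loc aligns the boolean Series on its labels.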

Practical Examples and Use Cases

Example 11: Filtering rows based on conditions

import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]}
df = pd.DataFrame(data)
result = df.loc[df['Age'] > 25]
print(result)

Example 12: Selecting specific rows and columns

import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]}
df = pd.DataFrame(data)
result = df.loc[df['Age'] > 25, ['Name']]
print(result)

Example 13: Modifying a subset of a DataFrame

import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]}
df = pd.DataFrame(data)
df.loc[df['Age'] > 25, 'Age'] = 40
print(df)

Example 14: Using iloc to reorder rows

import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]}
df = pd.DataFrame(data)
result = df.iloc[[2, 1, 0]]
print(result)

Example 15: Combining loc and iloc for complex scenarios

import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]}
df = pd.DataFrame(data, index=['a', 'b', 'c'])
result = df.loc['a':'b'].iloc[:, 1]
print(result)

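
Chaining loc and iloc as in Example 15 is fine for reading data. As an alternative, labels and positions can be mixed inside a single indexer by translating one into the other first. This is a sketch of one way to do that using df.columns and Index.get_loc; the variable names are only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]},
                  index=['a', 'b', 'c'])

# Label-based rows with a positional column: look up the column label first.
age_label = df.columns[1]              # 'Age'
by_loc = df.loc['a':'b', age_label]

# Positional rows with a labelled column: look up the column position first.
age_pos = df.columns.get_loc('Age')    # 1
by_iloc = df.iloc[0:2, age_pos]

same = by_loc.equals(by_iloc)
print(same)

# A single indexer can also be used on the left-hand side of an assignment.
df.iloc[0, age_pos] = 26
print(df.loc['a', 'Age'])
```

Using one indexer rather than a chain is generally the safer choice when assigning values, because the assignment then targets the original DataFrame directly.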

Example 16: Using loc with a callable

import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]}
df = pd.DataFrame(data, index=['a', 'b', 'c'])
result = df.loc[lambda df: df['Age'] > 25]
print(result)

Example 17: Using iloc with a callable

import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]}
df = pd.DataFrame(data)
result = df.iloc[lambda df: [0, 2]]
print(result)

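
Callables are most useful in method chains, where the intermediate DataFrame has no variable name to refer to. The sketch below filters on a column created earlier in the same chain; the AgeNextYear column is hypothetical and exists only for this illustration.

```python
import pandas as pd

df = pd.DataFrame({'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]})

result = (
    df.assign(AgeNextYear=lambda d: d['Age'] + 1)   # hypothetical derived column
      .loc[lambda d: d['AgeNextYear'] > 30, ['Name', 'AgeNextYear']]
)
print(result)
```

The callable receives the DataFrame as it exists at that point in the chain, so it can see AgeNextYear even though the original df does not have that column.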

Example 18: Selecting rows based on custom criteria

import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]}
df = pd.DataFrame(data)
result = df.loc[df['Name'].str.startswith('A')]
print(result)

Example 19: Using iloc to select columns

import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]}
df = pd.DataFrame(data)
result = df.iloc[:, [1]]
print(result)

Example 20: Complex boolean indexing with loc

import pandas as pd

data = {'Name': ['Alice', 'Bob', 'Charlie'], 'Age': [25, 30, 35]}
df = pd.DataFrame(data)
result = df.loc[(df['Age'] > 25) & (df['Name'].str.contains('o'))]
print(result)

In conclusion, loc and iloc are versatile tools in pandas that allow for efficient data selection and manipulation. By understanding how to use these indexing methods, you can handle a wide range of data processing tasks more effectively.