Pandas loc and iloc

Pandas is a Python library for data manipulation and analysis. Two of its most frequently used selection tools are loc and iloc: loc selects by label, and iloc selects by integer position. This article walks through both indexers with examples that cover row and column selection, filtering, and updating a DataFrame.

Understanding loc

The loc indexer accesses a group of rows and columns by label or by a boolean array. It works with the labels of the index and the column names, not with positions. When you slice with loc, both the start and the stop label are included.

Basic Usage of loc

Here is an example of how to use loc to select a single row from a DataFrame. Because the DataFrame below uses the default integer index, the label of its first row is 0:

import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Email': ['[email protected]', '[email protected]', '[email protected]']
}
df = pd.DataFrame(data)

print(df.loc[0])

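Output:

The result is a Series holding the values of the row labeled 0 (Alice, 25, and her email address), indexed by the column names.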
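To make the label-based behavior of loc easier to see, here is a minimal sketch that uses string labels instead of the default integers. The labels "a", "b", and "c" are hypothetical and exist only for this illustration.

```python
import pandas as pd

# Hypothetical string labels, chosen so that labels and positions differ.
df = pd.DataFrame(
    {"Name": ["Alice", "Bob", "Charlie"], "Age": [25, 30, 35]},
    index=["a", "b", "c"],
)

row_b = df.loc["b"]            # one row, looked up by its label
label_slice = df.loc["a":"b"]  # label slices include both endpoints

print(row_b["Name"])
print(list(label_slice.index))
```

Note that the slice "a":"b" returns two rows, because loc includes the stop label.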

Selecting Multiple Rows

loc can also select several rows at once when you pass a list of labels:

import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Email': ['[email protected]', '[email protected]', '[email protected]']
}
df = pd.DataFrame(data)

print(df.loc[[0, 2]])

Output:

The result is a DataFrame containing the rows labeled 0 and 2 (Alice and Charlie).

Selecting Rows with a Condition

You can use loc to select rows based on a condition. Here’s an example:

import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Email': ['[email protected]', '[email protected]', '[email protected]']
}
df = pd.DataFrame(data)

print(df.loc[df['Age'] > 25])

Output:

The result is a DataFrame containing the rows for Bob and Charlie, the only two with an Age greater than 25.

Selecting Specific Columns

With loc, you can specify the columns you want to select:

import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Email': ['[email protected]', '[email protected]', '[email protected]']
}
df = pd.DataFrame(data)

print(df.loc[:, ['Name', 'Email']])

Output:

The result is a DataFrame with all three rows, restricted to the Name and Email columns.
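Row and column selectors can also be combined in a single loc call. The sketch below assumes a small DataFrame with a hypothetical City column added for illustration; it shows a boolean row filter paired with a column list, a column label slice, and a single-cell lookup.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["Oslo", "Lima", "Cairo"],  # hypothetical extra column
})

# Rows where Age > 25, restricted to two columns.
older = df.loc[df["Age"] > 25, ["Name", "Age"]]

# Column label slice: both "Name" and "Age" are included.
name_to_age = df.loc[:, "Name":"Age"]

# A single cell, addressed by row label and column label.
single_value = df.loc[1, "Name"]

print(older)
print(list(name_to_age.columns))
print(single_value)
```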

Combining Conditions

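You can combine conditions to make more complex queries. Wrap each condition in parentheses and join them with & (and) or | (or); Python’s and/or keywords do not work element-wise on a Series: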

import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Email': ['[email protected]', '[email protected]', '[email protected]']
}
df = pd.DataFrame(data)

print(df.loc[(df['Age'] > 25) & (df['Name'] == 'Charlie')])

Output:

The result is a DataFrame with a single row, Charlie, the only row that satisfies both conditions.
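Conditions can also be built from Series methods and negated with ~. The following is a small sketch under the assumption that you want ages in a range while excluding a specific name; the variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
})

in_range = df["Age"].between(26, 40)  # inclusive on both ends by default
not_bob = ~df["Name"].isin(["Bob"])   # ~ negates a boolean mask

result = df.loc[in_range & not_bob]
print(result)
```

Storing each mask in a named variable keeps longer filters readable and avoids deeply nested parentheses.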

Understanding iloc

The iloc indexer accesses a group of rows and columns by integer position, counting from 0. Positions are independent of the index labels, and slices follow the usual Python convention of excluding the stop position.

Basic Usage of iloc

Here is how you can use iloc to select the first row of a DataFrame:

import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Email': ['[email protected]', '[email protected]', '[email protected]']
}
df = pd.DataFrame(data)

print(df.iloc[0])

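Output:

The result is a Series holding the first row (Alice). With the default integer index this matches the result of df.loc[0].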
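The difference between the two indexers only becomes visible when the index labels do not match the positions. Here is a minimal sketch that assumes a hypothetical index of 10, 20, and 30:

```python
import pandas as pd

# Hypothetical integer labels that differ from the positions 0, 1, 2.
df = pd.DataFrame(
    {"Name": ["Alice", "Bob", "Charlie"], "Age": [25, 30, 35]},
    index=[10, 20, 30],
)

first_by_position = df.iloc[0]  # position 0, whatever its label is
first_by_label = df.loc[10]     # the row whose label is 10

# There is no row labeled 0, so loc raises KeyError here.
try:
    df.loc[0]
    label_zero_exists = True
except KeyError:
    label_zero_exists = False

print(first_by_position["Name"], first_by_label["Name"], label_zero_exists)
```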

Selecting Multiple Rows

iloc can select multiple rows when you pass a list of row positions:

import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Email': ['[email protected]', '[email protected]', '[email protected]']
}
df = pd.DataFrame(data)

print(df.iloc[[0, 2]])

Output:

The result is a DataFrame containing the rows at positions 0 and 2 (Alice and Charlie).

Selecting Specific Columns

You can select specific columns by their integer position:

import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Email': ['[email protected]', '[email protected]', '[email protected]']
}
df = pd.DataFrame(data)

print(df.iloc[:, [0, 2]])

Output:

The result is a DataFrame with all three rows and the columns at positions 0 and 2 (Name and Email).

Slicing Rows and Columns

You can slice both rows and columns with iloc. As with Python lists, the stop position is excluded, so 0:2 selects positions 0 and 1:

import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Email': ['[email protected]', '[email protected]', '[email protected]']
}
df = pd.DataFrame(data)

print(df.iloc[0:2, 0:2])

Output:

The result is a DataFrame with the first two rows (Alice and Bob) and the first two columns (Name and Age).
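Because loc and iloc treat the stop value of a slice differently, the same slice expression can return a different number of rows. A short sketch on a DataFrame with the default integer index:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
})

by_position = df.iloc[0:2]  # positions 0 and 1; the stop is excluded
by_label = df.loc[0:2]      # labels 0, 1, and 2; the stop is included
every_other = df.iloc[::2]  # a step of 2 selects positions 0 and 2

print(len(by_position), len(by_label))
print(list(every_other["Name"]))
```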

Selecting Rows and Columns Simultaneously

With iloc, you can select specific rows and columns simultaneously:

import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Email': ['[email protected]', '[email protected]', '[email protected]']
}
df = pd.DataFrame(data)

print(df.iloc[[0, 2], [1, 2]])

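Output:

The result is a DataFrame with the rows at positions 0 and 2 (Alice and Charlie) and the columns at positions 1 and 2 (Age and Email).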
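Sometimes you know row positions but column names, or the reverse. One possible approach, sketched below, is to translate between labels and positions with Index.get_indexer or by indexing df.index, then stay with a single indexer. The City column is a hypothetical stand-in used for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
    "City": ["Oslo", "Lima", "Cairo"],  # hypothetical column
})

# Column names -> integer positions, so iloc can be used throughout.
col_positions = df.columns.get_indexer(["Age", "City"])
by_position = df.iloc[[0, 2], col_positions]

# Row positions -> index labels, so loc can be used throughout.
row_labels = df.index[[0, 2]]
by_label = df.loc[row_labels, ["Age", "City"]]

print(by_position)
print(by_position.equals(by_label))
```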

Practical Examples Combining loc and iloc

Updating Data

You can update data in a DataFrame using either loc or iloc. Here’s an example using loc, which addresses the cell by row label and column name:

import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Email': ['[email protected]', '[email protected]', '[email protected]']
}
df = pd.DataFrame(data)

df.loc[0, 'Age'] = 26
print(df)

Output:

The printed DataFrame shows Alice’s Age as 26; the other rows are unchanged.

And here’s how you can do it with iloc:

import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Email': ['[email protected]', '[email protected]', '[email protected]']
}
df = pd.DataFrame(data)

df.iloc[0, 1] = 27
print(df)

Output:

The printed DataFrame shows Alice’s Age as 27, because column position 1 is the Age column.
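Updates are not limited to a single cell. As a minimal sketch, a boolean mask and a column label in one loc call can update every matching row; the threshold of 28 is arbitrary and chosen only for the example.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
})

# Add one year to every row where Age is above 28.
df.loc[df["Age"] > 28, "Age"] += 1

print(df)
```

Doing the row filter and the column selection in one loc call matters here: a chained form such as df[df["Age"] > 28]["Age"] = ... may assign to a temporary copy and leave df unchanged.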

Deleting Rows

loc does not delete rows itself, but you can drop rows by keeping only those that satisfy a condition and reassigning the result:

import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Email': ['[email protected]', '[email protected]', '[email protected]']
}
df = pd.DataFrame(data)

df = df.loc[df['Age'] > 25]
print(df)

Output:

The printed DataFrame contains only Bob and Charlie, and they keep their original index labels 1 and 2.
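An alternative to filtering is to remove rows explicitly with DataFrame.drop. The sketch below is one way to do it, not the only one: it collects the labels of the rows to remove and then renumbers the index with reset_index.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
})

# Labels of the rows that should be removed.
to_remove = df.index[df["Age"] <= 25]

# Drop those rows, then renumber the remaining rows from 0.
trimmed = df.drop(index=to_remove).reset_index(drop=True)

print(trimmed)
```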

Adding a New Column

Assigning to a column label that does not exist yet creates the column. You can do this with loc:

import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Email': ['[email protected]', '[email protected]', '[email protected]']
}
df = pd.DataFrame(data)

df.loc[:, 'Location'] = 'Unknown'
print(df)

Output:

The printed DataFrame has a new Location column with the value Unknown in every row.
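A common follow-up is to fill a new column differently depending on a condition. Here is a small sketch, assuming a hypothetical Group column and an arbitrary cutoff of 30:

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
})

# Start with a default value for every row.
df["Group"] = "junior"

# Overwrite the value only where the condition holds.
df.loc[df["Age"] >= 30, "Group"] = "senior"

print(df)
```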

Complex Conditions

The | operator combines conditions with a logical OR, and string methods such as str.contains can be part of a condition:

import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Email': ['[email protected]', '[email protected]', '[email protected]']
}
df = pd.DataFrame(data)

print(df.loc[(df['Age'] > 25) | (df['Name'].str.contains('Bob'))])

Output:

The result is a DataFrame containing Bob and Charlie: both are older than 25, and Bob also matches the name condition.
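Real data often has missing or inconsistently cased strings. The following sketch uses invented sample data to show the case and na parameters of str.contains, which make the match case-insensitive and treat missing names as non-matches.

```python
import pandas as pd

# Invented sample data with mixed case and a missing name.
df = pd.DataFrame({
    "Name": ["Alice", "bob", None, "Bobby"],
    "Age": [25, 30, 35, 40],
})

# case=False ignores letter case; na=False maps missing values to False.
name_match = df["Name"].str.contains("bob", case=False, na=False)

result = df.loc[name_match | (df["Age"] > 38)]
print(result)
```

Without na=False, the mask would contain a missing value for the third row, and using such a mask for selection is likely to fail.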

Using iloc with Functions

iloc also accepts negative positions, which count from the end of the axis. For example, to get the last two rows:

import pandas as pd

data = {
    'Name': ['Alice', 'Bob', 'Charlie'],
    'Age': [25, 30, 35],
    'Email': ['[email protected]', '[email protected]', '[email protected]']
}
df = pd.DataFrame(data)

print(df.iloc[-2:])

Output:

The result is a DataFrame containing the last two rows (Bob and Charlie).
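iloc can also take a callable, which receives the DataFrame and returns the positions to select. This is a minimal sketch of that form; the lambda below is only one illustrative choice, picking the first and last rows regardless of the DataFrame’s length.

```python
import pandas as pd

df = pd.DataFrame({
    "Name": ["Alice", "Bob", "Charlie"],
    "Age": [25, 30, 35],
})

# The callable is passed the DataFrame and returns a list of positions.
first_and_last = df.iloc[lambda d: [0, len(d) - 1]]

# A single negative position selects one row, counted from the end.
last_row = df.iloc[-1]

print(first_and_last)
print(last_row["Name"])
```

The callable form is mainly useful in method chains, where the intermediate DataFrame has no variable name to refer to.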

Conclusion

The loc and iloc indexers cover most selection and assignment tasks in Pandas: loc works with labels and boolean masks, while iloc works with integer positions. Keeping that distinction in mind, along with the fact that loc slices include the stop label and iloc slices exclude the stop position, helps you filter data, select specific elements, and modify a DataFrame without surprises.