Pandas DataFrame loc or Operator
Pandas is a powerful data manipulation library in Python. It provides data structures and functions needed to manipulate structured data. One of the most commonly used data structures in pandas is the DataFrame. It is a two-dimensional labeled data structure with columns of potentially different types.
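In this article, we will focus on loc in a pandas DataFrame and how to combine it with a logical OR. Although loc is often called an operator, it is strictly an indexer used with square brackets (df.loc[...]) for label-based selection of data from a DataFrame. It can accept a single label, a list of labels, a slice of labels, or a boolean array. Note that the OR used inside loc is the element-wise | operator, not the Python keyword or, which raises a ValueError when applied to a boolean Series.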
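As a quick illustration of the input kinds listed above, here is a minimal sketch. The string index labels 'a', 'b', 'c' and the variable names are assumptions made for this example, not part of the article's data.

```python
import pandas as pd

# Hypothetical sample data, indexed by string labels so that the
# label-based nature of loc is easy to see.
df = pd.DataFrame(
    {"visits": [100, 200, 300], "revenue": [10.0, 20.0, 30.0]},
    index=["a", "b", "c"],
)

single = df.loc["b"]                   # single label -> Series for that row
several = df.loc[["a", "c"]]           # list of labels -> DataFrame
masked = df.loc[[True, False, True]]   # boolean array -> rows marked True

print(single)
print(several)
print(masked)
```

The boolean list must have one entry per row; here it picks the same rows as the list of labels.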
1. Basic Usage of loc Operator
The loc operator is used to access a group of rows and columns by labels or a boolean array. Here is a basic example:
import pandas as pd
data = {
    'name': ['pandasdataframe.com', 'example.com', 'test.com'],
    'visits': [100, 200, 300],
    'revenue': [10.0, 20.0, 30.0]
}
df = pd.DataFrame(data)
# Access the row with label 1
print(df.loc[1])
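Output: a Series holding the values of row 1 (name example.com, visits 200, revenue 20.0), indexed by the column names.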
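In the example above the label 1 happens to coincide with the row position, because the DataFrame uses the default integer index. The sketch below uses an assumed index of 10, 20, 30 (chosen only for illustration) to show that loc looks up labels, while iloc looks up positions.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "name": ["pandasdataframe.com", "example.com", "test.com"],
        "visits": [100, 200, 300],
    },
    index=[10, 20, 30],  # assumed labels, chosen to differ from positions
)

by_label = df.loc[20]      # row whose index label is 20
by_position = df.iloc[1]   # second row, counted by position

# Label slices with loc include both endpoints.
inclusive = df.loc[10:20]

# There is no row labelled 1, so loc raises KeyError.
try:
    df.loc[1]
    missing = False
except KeyError:
    missing = True

print(by_label)
print(inclusive)
print("label 1 missing:", missing)
```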
2. Using loc with Boolean Array
You can also use a boolean array with the loc operator to select rows. Here is an example:
import pandas as pd
data = {
    'name': ['pandasdataframe.com', 'example.com', 'test.com'],
    'visits': [100, 200, 300],
    'revenue': [10.0, 20.0, 30.0]
}
df = pd.DataFrame(data)
# Select rows where visits is greater than 100
print(df.loc[df['visits'] > 100])
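Output: a DataFrame containing rows 1 and 2 (example.com and test.com), the two rows whose visits value exceeds 100.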
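It can help to look at the boolean array on its own before passing it to loc. The following sketch stores the comparison in a variable (the name mask is an assumption for this example) and also shows the callable form that loc accepts, which receives the DataFrame and returns the mask.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "name": ["pandasdataframe.com", "example.com", "test.com"],
        "visits": [100, 200, 300],
        "revenue": [10.0, 20.0, 30.0],
    }
)

# The comparison produces a boolean Series aligned with df's index.
mask = df["visits"] > 100
print(mask)

# loc keeps the rows where the mask is True.
selected = df.loc[mask]

# Equivalent form: loc calls the function with df and uses its result.
selected_callable = df.loc[lambda d: d["visits"] > 100]

print(selected)
```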
3. Using loc to Select Columns
The loc operator can also be used to select columns. Here is an example:
import pandas as pd
data = {
    'name': ['pandasdataframe.com', 'example.com', 'test.com'],
    'visits': [100, 200, 300],
    'revenue': [10.0, 20.0, 30.0]
}
df = pd.DataFrame(data)
# Select the 'name' and 'visits' columns
print(df.loc[:, ['name', 'visits']])
Output: a DataFrame with all three rows and only the name and visits columns.
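Besides a list of column labels, the column position of loc also accepts a label slice or a single label. The sketch below is a small variation on the example above; the variable names are assumptions for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "name": ["pandasdataframe.com", "example.com", "test.com"],
        "visits": [100, 200, 300],
        "revenue": [10.0, 20.0, 30.0],
    }
)

# Label slice over columns: both endpoints are included.
cols_slice = df.loc[:, "name":"visits"]

# A single column label returns a Series instead of a DataFrame.
one_col = df.loc[:, "visits"]

print(cols_slice)
print(one_col)
```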
4. Using loc to Select Rows and Columns
You can use the loc operator to select both rows and columns at the same time. Here is an example:
import pandas as pd
data = {
    'name': ['pandasdataframe.com', 'example.com', 'test.com'],
    'visits': [100, 200, 300],
    'revenue': [10.0, 20.0, 30.0]
}
df = pd.DataFrame(data)
# Select the 'name' and 'visits' columns for rows where visits is greater than 100
print(df.loc[df['visits'] > 100, ['name', 'visits']])
Output: a DataFrame with rows 1 and 2 (example.com and test.com) and only the name and visits columns.
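The type of the result depends on what is passed in each position. As a rough sketch (variable names are assumptions for this example), a list of columns gives a DataFrame, a single column label gives a Series, and a single row label with a single column label gives a scalar.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "name": ["pandasdataframe.com", "example.com", "test.com"],
        "visits": [100, 200, 300],
        "revenue": [10.0, 20.0, 30.0],
    }
)

rows = df["visits"] > 100

subset = df.loc[rows, ["name", "visits"]]  # DataFrame: 2 rows, 2 columns
series = df.loc[rows, "visits"]            # Series: one column, 2 rows
scalar = df.loc[1, "visits"]               # single value at row 1, column 'visits'

print(subset)
print(series)
print(scalar)
```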
5. Using the or Operator with loc
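A logical OR can be used with the loc operator to select rows that satisfy either of two conditions. With pandas boolean Series, OR is written with the element-wise | operator rather than the Python keyword or, and each condition must be wrapped in parentheses because | binds more tightly than comparison operators such as >. Here is an example: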
import pandas as pd
data = {
    'name': ['pandasdataframe.com', 'example.com', 'test.com'],
    'visits': [100, 200, 300],
    'revenue': [10.0, 20.0, 30.0]
}
df = pd.DataFrame(data)
# Select rows where visits is greater than 100 or revenue is greater than 10.0
print(df.loc[(df['visits'] > 100) | (df['revenue'] > 10.0)])
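Output: a DataFrame containing rows 1 and 2 (example.com and test.com). With this data both conditions are true for the same two rows, so row 0, which satisfies neither, is the only row excluded.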
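To make the effect of OR more visible, the sketch below uses thresholds that each match a different row (visits above 250, revenue below 15.0; both thresholds are assumptions chosen for illustration). It also shows that the Python keyword or fails on boolean Series.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "name": ["pandasdataframe.com", "example.com", "test.com"],
        "visits": [100, 200, 300],
        "revenue": [10.0, 20.0, 30.0],
    }
)

high_visits = df["visits"] > 250    # True only for row 2
low_revenue = df["revenue"] < 15.0  # True only for row 0

# Element-wise OR: a row is kept if either condition is True.
result = df.loc[high_visits | low_revenue]
print(result)

# The keyword `or` asks for a single truth value of a whole Series,
# which pandas refuses with a ValueError.
try:
    df.loc[high_visits or low_revenue]
    raised = False
except ValueError as exc:
    raised = True
    print("ValueError:", exc)
```

Storing each condition in a named variable also avoids the parentheses that are needed when the comparisons are written inline.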
6. Using the or Operator with loc to Select Columns
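A logical OR can also be applied to column selection. In the example below, the isin method builds a boolean array over the column labels that is True where the label is 'name' or 'visits', and loc keeps the columns marked True. Here is an example: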
import pandas as pd
data = {
    'name': ['pandasdataframe.com', 'example.com', 'test.com'],
    'visits': [100, 200, 300],
    'revenue': [10.0, 20.0, 30.0]
}
df = pd.DataFrame(data)
# Select the 'name' column or the 'visits' column
print(df.loc[:, df.columns.isin(['name', 'visits'])])
Output: a DataFrame with all three rows and only the name and visits columns.
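The isin call above can be read as shorthand for an explicit OR over column-label comparisons. The following sketch builds both masks and checks that they agree; the variable names are assumptions for this example.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "name": ["pandasdataframe.com", "example.com", "test.com"],
        "visits": [100, 200, 300],
        "revenue": [10.0, 20.0, 30.0],
    }
)

# Boolean array over the column labels, built with isin.
col_mask_isin = df.columns.isin(["name", "visits"])

# The same mask written as an explicit element-wise OR.
col_mask_or = (df.columns == "name") | (df.columns == "visits")

print(col_mask_isin)
print(col_mask_or)

# Either mask can be passed in the column position of loc.
selected = df.loc[:, col_mask_or]
print(selected)
```

For more than two or three labels, isin is usually the shorter form.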
7. Using the or Operator with loc to Select Rows and Columns
You can combine an OR row condition with a column selection in a single loc call: the first argument selects the rows that satisfy either of two conditions, and the second argument lists the columns to keep. Here is an example:
import pandas as pd
data = {
    'name': ['pandasdataframe.com', 'example.com', 'test.com'],
    'visits': [100, 200, 300],
    'revenue': [10.0, 20.0, 30.0]
}
df = pd.DataFrame(data)
# Select the 'name' and 'visits' columns for rows where visits is greater than 100 or revenue is greater than 10.0
print(df.loc[(df['visits'] > 100) | (df['revenue'] > 10.0), ['name', 'visits']])
Output: a DataFrame with rows 1 and 2 (example.com and test.com) and only the name and visits columns.
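When there are more than two conditions, chaining | by hand gets long. One possible approach, sketched here with assumed conditions and variable names, is to keep the conditions in a list and fold them together with functools.reduce and operator.or_ from the standard library.

```python
from functools import reduce
import operator

import pandas as pd

df = pd.DataFrame(
    {
        "name": ["pandasdataframe.com", "example.com", "test.com"],
        "visits": [100, 200, 300],
        "revenue": [10.0, 20.0, 30.0],
    }
)

# Hypothetical conditions, chosen so that each matches a different row
# (or none at all).
conditions = [
    df["visits"] > 250,                  # row 2
    df["revenue"] < 5.0,                 # no rows
    df["name"] == "pandasdataframe.com"  # row 0
]

# Fold the list with element-wise OR: cond1 | cond2 | cond3.
any_mask = reduce(operator.or_, conditions)

result = df.loc[any_mask, ["name", "visits"]]
print(result)
```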
In conclusion, the loc operator is a powerful tool for selecting data from a pandas DataFrame. When combined with the element-wise | operator, it allows rows and columns to be selected based on multiple conditions.