Working with data often means sifting through a large set of numbers, names, or categories to focus on what matters. In pandas, that’s where filtering by column values comes in. It lets you narrow your DataFrame to include just the rows you want—nothing more, nothing less. Whether you’re cleaning up a dataset, selecting a group for analysis, or preparing data for visualization, filtering is a regular task. Thankfully, pandas makes it pretty smooth.
Let’s go over some of the practical ways to filter rows based on what’s in a column. You’ll see how to use conditions, match values, apply functions, and more.
Perhaps the most straightforward way to filter is through the use of conditions. If you've used pandas in the past, you've likely already encountered this technique. It's popular because it simply works, and it's easy to understand.
import pandas as pd

df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 40]
})

# Filter rows where Age is greater than 30
filtered_df = df[df['Age'] > 30]
This returns all the rows in which the 'Age' column has a value greater than 30. You can swap in other comparison operators as well: less than, equal to, not equal to, and so on. The same principle works with strings:
# Names that are not 'Bob'
df[df['Name'] != 'Bob']
These are basic conditionals, and you can chain them using logical operators if you need more than one condition.
If you want to apply more than one filter, you'll have to use & for AND and | for OR, and enclose each condition in parentheses. pandas does not recognize Python's regular and/or keywords when filtering a DataFrame.
# Age over 30 AND name starts with 'D'
df[(df['Age'] > 30) & (df['Name'].str.startswith('D'))]
This makes it easy to layer on multiple filters without writing a loop or separate functions. Just be sure to use parentheses, or pandas will throw an error.
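The OR case looks the same; only the operator changes. A quick sketch, reusing the toy DataFrame from above:

# Age under 30 OR name starts with 'A'
df[(df['Age'] < 30) | (df['Name'].str.startswith('A'))]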
If you’re working with many filters, or they change often, you might find it cleaner to build them on separate lines and combine them later:
condition1 = df['Age'] > 30
condition2 = df['Name'].str.contains('a')
filtered_df = df[condition1 & condition2]
This approach can make your code easier to follow and tweak.
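If the set of conditions isn't fixed in advance (say it's built from a config or inside a loop), one option, sketched here rather than prescribed, is to collect the boolean Series in a list and reduce them with &:

from functools import reduce
import operator

conditions = [
    df['Age'] > 20,
    df['Age'] < 40,
    df['Name'].str.contains('a', case=False),
]

# AND all the conditions together element-wise
filtered_df = df[reduce(operator.and_, conditions)]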
Sometimes you're not filtering with a condition like “greater than” or “equal to”—you just want to pull out a specific list of values. That’s where isin() comes in handy.
# Filter for names that are either Alice or Charlie
df[df['Name'].isin(['Alice', 'Charlie'])]
This is a common case when working with categorical data or filtering based on labels. You can put ~ in front of the condition to invert it:
# Exclude Alice and Charlie
df[~df['Name'].isin(['Alice', 'Charlie'])]
It’s compact, readable, and works well with long lists of values.
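The list doesn't even have to be typed inline; isin() accepts any iterable, so you can pass a collection built elsewhere in your code (allowed_names here is a hypothetical allow-list):

# A hypothetical allow-list assembled elsewhere
allowed_names = {'Alice', 'Bob', 'Charlie'}
df[df['Name'].isin(allowed_names)]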
There are times when your filtering logic can’t be neatly expressed with a single condition or value list. You may want to apply more flexible checks, and that’s where .apply() or .map() comes in.
# Define a custom function
def age_group(age):
    return age >= 30 and age <= 35

# Filter using apply
df[df['Age'].apply(age_group)]
You can also use lambdas to keep things shorter:
df[df['Name'].apply(lambda x: x.startswith('C'))]
For columns that contain complex data or when your filter logic needs a few lines to explain, writing a function and using apply() can save you from making a mess with chained conditions.
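Since .map() was mentioned above but not shown, here is a minimal sketch of the same idea with it; on a Series, .map() with a function behaves much like .apply():

# Keep rows whose name is longer than four characters
df[df['Name'].map(lambda x: len(x) > 4)]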
query() gives you a way to filter using a string expression. It’s often cleaner, especially if your column names are simple.
# Using query for readability
df.query('Age > 30 and Name.str.startswith("D")', engine='python')
This syntax is easier to write and looks more like SQL. It’s especially useful when dealing with multi-line filters or working with notebooks where readability matters. Just be careful if your column names have spaces—you’ll need to use backticks around those names in your query.
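As a sketch of that backtick case, using a hypothetical DataFrame with a 'Years Active' column that contains a space:

# Hypothetical column name containing a space
df2 = pd.DataFrame({'Name': ['Alice', 'Bob'], 'Years Active': [5, 12]})

# Wrap the name in backticks inside the query string
df2.query('`Years Active` > 10')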
Text columns often need to be filtered based on whether they include a keyword or match a pattern. pandas makes this flexible through string methods. Just add .str before the method, and it'll work on the whole column.
# Names containing the letter 'a'
df[df['Name'].str.contains('a')]
# Names that end with 'e'
df[df['Name'].str.endswith('e')]
For pattern matching, you can pass a regular expression; str.contains() treats its pattern as a regex by default. If you're working with structured text data such as names, codes, or IDs, this becomes useful very quickly.
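As a small sketch on the same toy data, this keeps names that start with either 'A' or 'C':

# Names starting with 'A' or 'C' (regex character class)
df[df['Name'].str.contains(r'^[AC]', regex=True)]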
Even though filtering usually works on column values, sometimes it’s simpler to filter by index—especially if your index carries useful information like dates, labels, or categories.
# Set Name as index and filter by index
df_indexed = df.set_index('Name')
df_indexed.loc[['Alice', 'David']]
This works well when your index is already aligned with what you want to filter by. It’s less common in early data stages, but helpful later on when your data structure changes.
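As a sketch of the date case (with made-up numbers), a DatetimeIndex lets .loc select rows by partial date strings:

# Hypothetical time-indexed data
sales = pd.DataFrame(
    {'Revenue': [100, 150, 120, 90]},
    index=pd.to_datetime(['2024-01-05', '2024-01-20', '2024-02-03', '2024-02-18'])
)

# All rows from January 2024
sales.loc['2024-01']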
Once you start repeating certain filters across different datasets or projects, it makes sense to turn them into reusable functions. This doesn’t just save time—it also keeps your workflow consistent.
def filter_by_age(df, min_age, max_age):
    return df[(df['Age'] >= min_age) & (df['Age'] <= max_age)]

# Use the function
filtered = filter_by_age(df, 28, 36)
This kind of structure is useful when sharing code with others or cleaning up long notebooks. You don’t have to remember or rewrite filters every time.
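A small bonus if you like method chains: functions written this way also slot into .pipe(), which calls the function with the DataFrame as its first argument. That's a style choice, not a requirement:

# Equivalent call, written as part of a chain
filtered = df.pipe(filter_by_age, 28, 36)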
Filtering rows in a pandas DataFrame by column values is one of those things that seems simple, but quickly grows into many variations depending on what kind of data you’re working with. Conditions, lists of values, custom logic, and even string patterns all come into play. Whether you’re preparing data for charts, slicing it for analysis, or just narrowing things down to see what's going on, these techniques are the backbone of that process. If you're familiar with a few of them, switching between methods depending on the situation becomes second nature.