Working with data often means sifting through a large set of numbers, names, or categories to focus on what matters. In pandas, that's where filtering by column values comes in. It lets you narrow your DataFrame to include just the rows you want: nothing more, nothing less. Whether you're cleaning up a dataset, selecting a group for analysis, or preparing data for visualization, filtering is a regular task. Thankfully, pandas makes it pretty smooth.
Let’s go over some of the practical ways to filter rows based on what’s in a column. You’ll see how to use conditions, match values, apply functions, and more.
Perhaps the most straightforward way to filter is through the use of conditions. If you've used pandas in the past, you've likely already encountered this technique. It's popular because it simply works, and it's easy to understand.
python
import pandas as pd
df = pd.DataFrame({
    'Name': ['Alice', 'Bob', 'Charlie', 'David'],
    'Age': [25, 30, 35, 40]
})
# Filter rows where Age is greater than 30
filtered_df = df[df['Age'] > 30]
This returns all the rows in which the 'Age' column has a value greater than 30. You can use other comparison operators as well: less than, equal to, not equal to, and so on. The same principle works with strings:
python
# Names that are not 'Bob'
df[df['Name'] != 'Bob']
These are basic conditionals, and you can chain them using logical operators if you need more than one condition.
If you wish to apply more than one filter, you'll have to use & for AND, | for OR, and enclose each condition in parentheses. pandas does not support Python's plain and and or keywords in DataFrame filtering.
python
# Age over 30 AND name starts with 'D'
df[(df['Age'] > 30) & (df['Name'].str.startswith('D'))]
This makes it easy to layer on multiple filters without writing a loop or separate functions. Just be sure to include the parentheses; & and | bind more tightly than comparison operators, so leaving them out raises an error.
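For completeness, here's a quick sketch of the OR form using the same sample DataFrame; note the | operator and the parentheses around each condition:
python
# Age over 30 OR name starts with 'A'
df[(df['Age'] > 30) | (df['Name'].str.startswith('A'))]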
If you're working with many filters, or they change often, you might find it cleaner to build them on separate lines and combine them later:
python
condition1 = df['Age'] > 30
condition2 = df['Name'].str.contains('a')
filtered_df = df[condition1 & condition2]
This approach can make your code easier to follow and tweak.
Sometimes you're not filtering with a condition like “greater than” or “equal to”—you just want to pull out a specific list of values. That’s where isin() comes in handy.
python
# Filter for names that are either Alice or Charlie
df[df['Name'].isin(['Alice', 'Charlie'])]
This is a common case when working with categorical data or filtering based on labels. You can put ~ in front of the condition to select the opposite rows:
python
# Exclude Alice and Charlie
df[~df['Name'].isin(['Alice', 'Charlie'])]
It’s compact, readable, and works well with long lists of values.
There are times when your filtering logic can't be neatly expressed with a single condition or value list. You may want to apply more flexible checks, and that's where .apply() or .map() comes in.
python
# Define a custom function
def age_group(age):
    return age >= 30 and age <= 35
# Filter using apply
df[df['Age'].apply(age_group)]
You can also use lambdas to keep things shorter:
python
df[df['Name'].apply(lambda x: x.startswith('C'))]
For columns that contain complex data, or when your filter logic takes a few lines to express, writing a function and using apply() can save you from making a mess of chained conditions.
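Since .map() was mentioned above, here's a minimal sketch of the same style of filter written with it; when given a function, Series.map() applies it element by element, much like apply():
python
# Same filter expressed with map() instead of apply()
df[df['Age'].map(lambda age: 30 <= age <= 35)]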
query() gives you a way to filter using a string expression. It’s often cleaner, especially if your column names are simple.
python
# Using query for readability
df.query('Age > 30 and Name.str.startswith("D")', engine='python')
This syntax is easier to write and looks more like SQL. It’s especially useful when dealing with multi-line filters or working with notebooks where readability matters. Just be careful if your column names have spaces—you’ll need to use backticks around those names in your query.
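For example, a hypothetical DataFrame with a 'First Name' column (not part of the earlier example) would need backticks inside the query string:
python
# Hypothetical DataFrame with a space in a column name
df2 = pd.DataFrame({'First Name': ['Alice', 'Bob'], 'Age': [25, 30]})
df2.query('`First Name` == "Alice"')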
Text columns often need to be filtered based on whether they include a keyword or match a pattern. pandas makes this flexible through string methods. Just add .str before the method, and it'll work on the whole column.
python
# Names containing the letter 'a'
df[df['Name'].str.contains('a')]
# Names that end with 'e'
df[df['Name'].str.endswith('e')]
For pattern matching, you can include a regex. If you're working with structured text data—names, codes, IDs—this becomes useful very quickly.
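As a small sketch, str.contains() treats its pattern as a regular expression by default, so something like this matches names that start with 'A' or 'B':
python
# Names starting with 'A' or 'B' (regex is the default for str.contains)
df[df['Name'].str.contains(r'^[AB]')]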
Even though filtering usually works on column values, sometimes it’s simpler to filter by index—especially if your index carries useful information like dates, labels, or categories.
python
# Set Name as index and filter by index
df_indexed = df.set_index('Name')
df_indexed.loc[['Alice', 'David']]
This works well when your index is already aligned with what you want to filter by. It’s less common in early data stages, but helpful later on when your data structure changes.
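For instance, here's a minimal sketch with a hypothetical date-indexed DataFrame (df_ts is an assumption, not part of the earlier example); on a sorted DatetimeIndex, .loc accepts label slices written as date strings:
python
# Hypothetical time series indexed by date
df_ts = pd.DataFrame(
    {'Sales': [100, 120, 90, 150]},
    index=pd.date_range('2024-01-01', periods=4, freq='D')
)
# Rows between two dates (both endpoints included)
df_ts.loc['2024-01-02':'2024-01-03']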
Once you start repeating certain filters across different datasets or projects, it makes sense to turn them into reusable functions. This doesn’t just save time—it also keeps your workflow consistent.
python
def filter_by_age(df, min_age, max_age):
    return df[(df['Age'] >= min_age) & (df['Age'] <= max_age)]
# Use the function
filtered = filter_by_age(df, 28, 36)
This kind of structure is useful when sharing code with others or cleaning up long notebooks. You don’t have to remember or rewrite filters every time.
Filtering rows in a pandas DataFrame by column values is one of those things that seems simple, but quickly grows into many variations depending on what kind of data you’re working with. Conditions, lists of values, custom logic, and even string patterns all come into play. Whether you’re preparing data for charts, slicing it for analysis, or just narrowing things down to see what's going on, these techniques are the backbone of that process. If you're familiar with a few of them, switching between methods depending on the situation becomes second nature.