Filtering data in R

vindianadoan

Posted on February 27, 2023

Base R

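  • subset() function: keeps the rows that satisfy a condition; column names can be used directly, without the df$ prefix.
filtered_df <- subset(df, column_name > 5)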
  • subset operator [ ]: index rows with a logical condition; the empty slot after the comma keeps all columns.
filtered_df <- df[df$column_name > 5, ]
  • row indices in the subset operator: pass a vector of row numbers instead of a condition.
filtered_df <- df[indices, ]
  • which() function: returns the positions where a condition is TRUE, which can then be used as row indices:
# create a data frame
df <- data.frame(name = c("Alice", "Bob", "Charlie", "David"),
                 age = c(25, 35, 40, 30))

# use which() to extract indices of rows where age > 30
indices <- which(df$age > 30)

# use the indices to extract the subset of rows
subset_df <- df[indices, ]

# print the resulting data frame
subset_df
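The three base R approaches above give the same result on clean data, but they can differ when the filtered column contains missing values. The sketch below illustrates this, assuming a made-up data frame with one NA in age (the data and the threshold of 28 are hypothetical, chosen only for the example).

```r
# Hypothetical data frame with one missing age, for illustration only
df <- data.frame(name = c("Alice", "Bob", "Charlie", "David"),
                 age = c(25, 35, NA, 30))

# Logical indexing: an NA in the condition produces a row filled with NA
by_logical <- df[df$age > 28, ]

# which() returns only the positions where the condition is TRUE
by_which <- df[which(df$age > 28), ]

# subset() also drops rows where the condition evaluates to NA
by_subset <- subset(df, age > 28)

nrow(by_logical)  # 3 rows: Bob, an all-NA row, David
nrow(by_which)    # 2 rows: Bob, David
nrow(by_subset)   # 2 rows: Bob, David
```

If the column may contain NA, which() or subset() is usually the safer choice; with plain logical indexing you can add & !is.na(df$age) to the condition.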

Tidyverse/dplyr

  • filter() function: keeps the rows that satisfy one or more conditions.
library(dplyr)
subset_df <- filter(df, column_name > 30)
subset_df <- filter(df, column1_name > 30 
                  & column2_name == TRUE)
  • slice() function: selects rows by position.
subset_df <- slice(df, indices)
subset_df <- slice(df, c(1,3))
  • select() function: selects columns by name (it subsets columns rather than rows).
subset_df <- select(df, column_name)
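To see the three dplyr verbs on concrete data, here is a minimal sketch that reuses the name/age data frame from the base R section. The member column is a hypothetical addition, included only so that a two-condition filter has something to test, and the example assumes the dplyr package is installed.

```r
library(dplyr)

df <- data.frame(name = c("Alice", "Bob", "Charlie", "David"),
                 age = c(25, 35, 40, 30),
                 member = c(TRUE, TRUE, FALSE, TRUE))  # hypothetical column

# Comma-separated conditions are combined with AND
older_members <- filter(df, age > 28, member)

# Rows 1 and 3, selected by position
picked <- slice(df, c(1, 3))

# Only the name column; the result is still a data frame
names_only <- select(df, name)

# The same verbs chained with the pipe
result <- df %>%
  filter(age > 28) %>%
  select(name, age)

older_members$name  # "Bob" "David"
picked$name         # "Alice" "Charlie"
```

Chaining with the pipe keeps each step readable when a row filter and a column selection are applied together.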