C# Search by multiple strings
Karen Payne
Posted on November 17, 2024
Introduction
Usually when there is a need to determine if a string has multiple tokens/words a developer uses code like the following.
public static class Extensions
{
public static bool Search(this string line) =>
// IndexOf returns -1 when the token is missing and 0 when the token starts the line.
line.IndexOf("hello", StringComparison.OrdinalIgnoreCase) >= 0 &&
line.IndexOf("world", StringComparison.OrdinalIgnoreCase) >= 0;
}
Starting with .NET 8, Microsoft provides the System.Buffers.SearchValues<T> class for byte and char values; the string overloads used in this article were added in .NET 9.
Learn about SearchValues.
Which to use IndexOf or SearchValues?
SearchValues improves the efficiency of search operations. By providing a dedicated, optimized type for lookups, it helps you write cleaner and more performant code, especially in scenarios where checking for multiple values is frequent.
SearchValues is not a replacement for IndexOf or IndexOfAny. Its advantage shows over larger strings, which means that for smaller strings a developer can still use IndexOf && IndexOf, etc.
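As a rough illustration of the trade-off, the sketch below keeps a SearchValues<string> instance in a static field, so the cost of building it is paid once, and places it next to a conventional IndexOf chain. The class and method names are hypothetical and not taken from the article's source, and the string overload of SearchValues.Create assumes .NET 9 or later. ContainsAny answers "is any token present", so the IndexOf version uses || to match it.

```csharp
using System;
using System.Buffers;

public static class GreetingSearch
{
    // Built once and reused; SearchValues.Create does its analysis up front.
    // Assumption: .NET 9 or later, where the string overload exists.
    private static readonly SearchValues<string> Tokens =
        SearchValues.Create(["hello", "world"], StringComparison.OrdinalIgnoreCase);

    // True when the line contains at least one of the tokens.
    public static bool ContainsAnyToken(string line) =>
        line.AsSpan().ContainsAny(Tokens);

    // Conventional equivalent: one IndexOf call per token.
    public static bool ContainsAnyTokenConventional(string line) =>
        line.IndexOf("hello", StringComparison.OrdinalIgnoreCase) >= 0 ||
        line.IndexOf("world", StringComparison.OrdinalIgnoreCase) >= 0;
}
```

Both methods return true for "Say HELLO to everyone" and false for "Nothing to see here".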
Examples for SearchValues
Does text contain spam?
The text can come from any source; in this case, to keep things simple, it comes from a text file.
TextBanned.txt
Hello Karen, I am writing to inform you that your account is now active.
This is not a spam message. Please click the link below
In the project, a JSON file holds the watched tokens/words.
bannedwords.json
[
{
"Id": "1",
"Name": "spam"
},
{
"Id": "2",
"Name": "advertisement"
},
{
"Id": "3",
"Name": "clickbait"
}
]
The following model is used to deserialize the file above.
public class BannedWord
{
public string Id { get; set; }
public string Name { get; set; }
}
Next, create a language extension method for SearchValues.
public static class GenericExtensions
{
/// <summary>
/// Determines whether the specified text contains any of the banned words.
/// </summary>
/// <param name="text">The text to be checked for banned words.</param>
/// <param name="bannedWords">An array of banned words to search for within the text.</param>
/// <returns>
/// <c>true</c> if the text contains any of the banned words; otherwise, <c>false</c>.
/// </returns>
public static bool HasBannedWords(this string text, params string[] bannedWords) =>
text.AsSpan().ContainsAny(SearchValues.Create(bannedWords, StringComparison.OrdinalIgnoreCase));
}
Note
The above extension method is case-insensitive. To support case-sensitive searches, either add a bool parameter that determines whether the search is case-insensitive, or create an overloaded method for matching case.
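One possible shape for the bool variant is sketched here. The class name and the ignoreCase parameter are hypothetical, not part of the article's source. SearchValues.Create for strings accepts only StringComparison.Ordinal and StringComparison.OrdinalIgnoreCase, so the flag simply picks between those two.

```csharp
using System;
using System.Buffers;

public static class BannedWordExtensions
{
    // Hypothetical overload: ignoreCase selects the comparison used for the search.
    public static bool HasBannedWords(this string text, bool ignoreCase, params string[] bannedWords)
    {
        // Only Ordinal and OrdinalIgnoreCase are valid for SearchValues<string>.
        StringComparison comparison = ignoreCase
            ? StringComparison.OrdinalIgnoreCase
            : StringComparison.Ordinal;

        return text.AsSpan().ContainsAny(SearchValues.Create(bannedWords, comparison));
    }
}
```

With this overload, "This is SPAM".HasBannedWords(true, "spam") is true, while passing false for ignoreCase makes the same call return false.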
The following code first reads the words/tokens to search for by deserializing bannedwords.json, then reads TextBanned.txt, which is the file to scan for spam.
Note that the foreach statement uses Enumerable.Index, new in .NET 9, which allows deconstruction into the current index (zero-based) and the item, where the item is a line from the variable sentences.
Debug.WriteLine is used because the source code was written in a Windows Forms project, where Console.WriteLine output is not shown.
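A minimal sketch of that flow, assuming .NET 9 and System.Text.Json, might look like the following. SpamScanner, FindSpamLines and Run are hypothetical names, the BannedWord class repeats the model shown earlier so the sample stands alone, and the two file names are assumed to sit next to the executable.

```csharp
using System;
using System.Buffers;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Linq;
using System.Text.Json;

// Same shape as the model shown earlier, repeated to keep the sketch self-contained.
public class BannedWord
{
    public string Id { get; set; } = "";
    public string Name { get; set; } = "";
}

public static class SpamScanner
{
    // Returns the zero-based index and text of every line containing a banned word.
    public static List<(int Index, string Line)> FindSpamLines(string json, IEnumerable<string> sentences)
    {
        var words = JsonSerializer.Deserialize<List<BannedWord>>(json) ?? new List<BannedWord>();
        string[] names = words
            .Select(word => word.Name)
            .Where(name => !string.IsNullOrEmpty(name))
            .ToArray();

        // Created once per scan and reused for every line.
        SearchValues<string> banned = SearchValues.Create(names, StringComparison.OrdinalIgnoreCase);

        var hits = new List<(int Index, string Line)>();
        // Enumerable.Index (.NET 9) yields (Index, Item) tuples.
        foreach (var (index, line) in sentences.Index())
        {
            if (line.AsSpan().ContainsAny(banned))
            {
                hits.Add((index, line));
            }
        }
        return hits;
    }

    // Assumed file locations; adjust the paths to match the project.
    public static void Run()
    {
        string json = File.ReadAllText("bannedwords.json");
        string[] sentences = File.ReadAllLines("TextBanned.txt");
        foreach (var (index, line) in FindSpamLines(json, sentences))
        {
            Debug.WriteLine($"{index,-5}{line}");
        }
    }
}
```

For the sample text file above, only the second line (index 1) is reported, because it contains the word spam.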
Find errors/warning in Visual Studio log file
When Visual Studio encounters errors they can be written to a log file by starting Visual Studio with the following command.
ActivityLog.xml can be opened by clicking on the file, but it usually has thousands of lines, and finding errors/warnings by eye can be tedious.
Small look at ActivityLog.xml.
Of the following extension methods, the first and second use SearchValues. For this code sample the second is used, as we are only interested in errors and warnings; the first would be used for general-purpose searches. The last extension method is the conventional approach, which is less flexible.
public static class Extensions
{
/// <summary>
/// Searches the specified string for any of the provided tokens case-insensitive.
/// </summary>
/// <param name="sender">The string to search within.</param>
/// <param name="tokens">An array of tokens to search for within the string.</param>
/// <returns>
/// <c>true</c> if any of the tokens are found within the string; otherwise, <c>false</c>.
/// </returns>
public static bool Search(this string sender, string[] tokens)
=> sender.AsSpan().ContainsAny(
SearchValues.Create(tokens,
StringComparison.OrdinalIgnoreCase));
/// <summary>
/// Determines whether the specified line contains a warning or error.
/// </summary>
/// <param name="line">The line of text to be checked for warnings or errors.</param>
/// <returns>
/// <c>true</c> if the line contains a warning or error; otherwise, <c>false</c>.
/// </returns>
public static bool LineHasWarningOrError(this string line)
{
ReadOnlySpan<string> tokens = ["<type>Error</type>", "<type>Warning</type>"];
return line.AsSpan().ContainsAny(SearchValues.Create(tokens, StringComparison.OrdinalIgnoreCase));
}
/// <summary>
/// Determines whether the specified line contains a warning or error using conventional string comparison.
/// </summary>
/// <param name="line">The line of text to be checked for warnings or errors.</param>
/// <returns>
/// <c>true</c> if the line contains a warning or error; otherwise, <c>false</c>.
/// </returns>
public static bool LineHasWarningOrErrorConventional(this string line) =>
line.IndexOf("<type>Error</type>", StringComparison.OrdinalIgnoreCase) >= 0 ||
line.IndexOf("<type>Warning</type>", StringComparison.OrdinalIgnoreCase) >= 0;
}
Executing code (full source is provided).
- First determine if the activity file exists, if so read it.
- Display the path and file name along with line count
- Iterate each line searching for errors and warnings.
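The steps above can be sketched roughly as follows. ActivityLogScanner, FindProblems and Run are hypothetical names rather than the article's actual classes, the file name is passed in by the caller, and Console.WriteLine is an assumption about where the output should go.

```csharp
using System;
using System.Buffers;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class ActivityLogScanner
{
    // Same tokens as LineHasWarningOrError, cached so they are built only once.
    private static readonly SearchValues<string> Tokens =
        SearchValues.Create(["<type>Error</type>", "<type>Warning</type>"],
            StringComparison.OrdinalIgnoreCase);

    // Returns the zero-based index and text of each line flagged as an error or warning.
    public static List<(int Index, string Line)> FindProblems(IEnumerable<string> lines)
    {
        var hits = new List<(int Index, string Line)>();
        foreach (var (index, line) in lines.Index())
        {
            if (line.AsSpan().ContainsAny(Tokens))
            {
                hits.Add((index, line));
            }
        }
        return hits;
    }

    public static void Run(string fileName)
    {
        // Step 1: make sure the activity file exists before reading it.
        if (!File.Exists(fileName))
        {
            Console.WriteLine($"{fileName} not found");
            return;
        }

        // Step 2: show the path and file name along with the line count.
        string[] lines = File.ReadAllLines(fileName);
        Console.WriteLine($"{fileName} has {lines.Length} lines");

        // Step 3: report each line that holds an error or warning.
        foreach (var (index, line) in FindProblems(lines))
        {
            Console.WriteLine($"{index,-8}{line.Trim()}");
        }
    }
}
```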
Extra
Finding the activity log is not easy, and there may be more than one. To assist with finding the right one, the provided source code has a class dedicated to working with the activity file, which includes providing the path to the file. This can be helpful for developers who want to examine older activity files.
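The article's helper class is not reproduced here. As a loose sketch of the idea only, the code below lists every ActivityLog.xml under a root folder, newest first. ActivityLogLocator is a hypothetical name, and DefaultRoot assumes the logs live under the roaming AppData folder in Microsoft\VisualStudio, which should be verified on the machine in question.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class ActivityLogLocator
{
    // Assumption: Visual Studio keeps its logs below %AppData%\Microsoft\VisualStudio.
    public static string DefaultRoot => Path.Combine(
        Environment.GetFolderPath(Environment.SpecialFolder.ApplicationData),
        "Microsoft", "VisualStudio");

    // Returns the full path of every ActivityLog.xml below root, most recently written first.
    public static List<string> FindLogs(string root)
    {
        if (!Directory.Exists(root))
        {
            return new List<string>();
        }

        return Directory
            .EnumerateFiles(root, "ActivityLog.xml", SearchOption.AllDirectories)
            .OrderByDescending(path => File.GetLastWriteTimeUtc(path))
            .ToList();
    }
}
```

Calling ActivityLogLocator.FindLogs(ActivityLogLocator.DefaultRoot) then gives a list to pick from when more than one Visual Studio version or instance has written a log.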
Source code
The two links below point to different GitHub repositories. For the Spam Source code check out new NET Core 9 features.
- Spam Source code
- Activity log Source code
Summary
SearchValues provides a new way to search for words/tokens in a string. It performs better than IndexOf for larger strings and is more flexible than IndexOf.