Processing Text with Linux Shell - Part 1

shamil

Posted on July 27, 2018

Into the world of sed

If you use a *nix system on a daily basis, chances are you are already familiar with, or have at least heard of, the sed command.

sed, short for Stream Editor, is a text transformation tool that comes bundled with virtually every Unix system. What distinguishes sed from interactive text editors is the speed at which it manipulates text: sed makes only one pass over the input, which keeps processing fast.

# Replace that ugly text

sed is a very powerful tool to replace a piece of text with another. The text can be matched using regular expressions.

sed 's/text_to_be_replaced/replacement_text/' file_name

However, this only prints the substituted text to the console; it does not change the file itself. To save the changes to the file, we can use the -i flag.

sed -i 's/text_to_be_replaced/replacement_text/' file_name

The above replaces only the first occurrence of the given pattern in each line. To replace every occurrence of the pattern, we can append the g flag to the end.

sed 's/text_to_be_replaced/replacement_text/g' file_name
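To see the difference between the default behaviour and the g flag without touching a real file, a small sketch like the one below may help. The sample text and the variable names are made up for illustration.

```shell
# Sample input (hypothetical): two lines, each containing "cat" twice.
text='cat and cat
cat or cat'

# Without g: only the first match on each line is replaced.
first_only=$(printf '%s\n' "$text" | sed 's/cat/dog/')

# With g: every match on each line is replaced.
all_matches=$(printf '%s\n' "$text" | sed 's/cat/dog/g')

printf '%s\n' "$first_only"
printf '%s\n' "$all_matches"
```

Because nothing here uses -i, the input is never modified; sed only writes the result to standard output.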

Note that the delimiter character / used in the above commands is not fixed; sed accepts almost any character as the delimiter. For example,

sed 's:text_to_be_replaced:replacement_text:g' file_name


sed 's|text_to_be_replaced|replacement_text|g' file_name

Okay, but what if the delimiter character is itself a part of the pattern to be replaced? ¿ⓧ_ⓧﮌ

Well, we can escape that character with a backslash. For example, to replace the word following: with below - , we can do this:

sed 's:following\::below - :' file_name

Notice the use of \: before the delimiter : that separates the pattern from its replacement.
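Picking a different delimiter is most useful when the text is full of slashes, such as a file path. The following sketch compares the two approaches on an invented input line; both commands are meant to produce the same result.

```shell
# Hypothetical input line containing a path full of slashes.
line='config lives in /usr/local/etc'

# Using | as the delimiter means the slashes in the path need no escaping.
with_pipe=$(printf '%s\n' "$line" | sed 's|/usr/local|/opt|')

# The same substitution with / as the delimiter: every slash in the
# pattern and in the replacement must be escaped with a backslash.
with_slash=$(printf '%s\n' "$line" | sed 's/\/usr\/local/\/opt/')

printf '%s\n' "$with_pipe"
printf '%s\n' "$with_slash"
```

The first form is usually easier to read, which is the main reason to switch delimiters.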

# Delete that scrap

sed also allows us to delete lines from a file. The d command indicates a delete operation. The general syntax to delete a line is

sed 'Nd' file_name

Here N is the line number that we want to delete. If we want to delete the 10th line from a file, N would be 10.

One of the most common uses of this command is deleting all blank lines in a file.

sed '/^$/d' file_name

The above will delete all the blank lines in the file. The regular expression ^$ matches an empty line (one with no characters at all, not even spaces), and the d command specifies that the line should be deleted.

That's not all. We can also specify a range of lines that should be deleted.

sed 'm,nd' file_name

The above command will delete all the lines from line m up to and including line n.
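The three forms of deletion can be tried side by side on a small piece of text. This is only a sketch; the five-line input and the variable names are assumptions made for the example.

```shell
# Hypothetical five-line input; the third line is empty.
input='one
two

four
five'

# Delete only line 2.
without_second=$(printf '%s\n' "$input" | sed '2d')

# Delete every empty line.
without_blank=$(printf '%s\n' "$input" | sed '/^$/d')

# Delete lines 2 through 4, inclusive.
without_range=$(printf '%s\n' "$input" | sed '2,4d')

printf '%s\n---\n' "$without_second" "$without_blank" "$without_range"
```

As with substitution, none of these commands changes the input unless -i is added.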

# Pipelining is important

Now what about pipelining multiple sed commands?

We can chain as many sed commands as we wish with pipes, and they are processed in the order written. Consider the following example.

echo Linux | sed 's/L/l/' | sed 's/n/N/' | sed 's/l/L/' | sed 's/x/X/'

This will output LiNuX.
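The order of the stages matters, because each sed only sees the output of the one before it. A minimal sketch, using two of the substitutions from the example above in both orders:

```shell
# Lowercase the L first, then uppercase it again: back to the original.
forward=$(echo Linux | sed 's/L/l/' | sed 's/l/L/')

# The same two commands in the opposite order: the first one finds no
# lowercase l and does nothing, so only the second has an effect.
reversed=$(echo Linux | sed 's/l/L/' | sed 's/L/l/')

printf '%s\n' "$forward"
printf '%s\n' "$reversed"
```

The first pipeline leaves the word unchanged, while the second one ends with a lowercase l.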

Finally, let's take a look at how we can use variables within a sed command. So far we have wrapped our commands in single quotes (' '). However, we can also use double quotes (" ") when the command needs the shell to expand a variable or expression. Take a look at the following example.

greet=hello

echo hello shamil | sed "s/$greet/hi/"

The shell will evaluate $greet first, so sed replaces hello with hi.
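The effect of the quoting style is easiest to see by running the same substitution both ways. This is a sketch with invented variable names for the results:

```shell
greet=hello

# Double quotes: the shell expands $greet before sed runs,
# so sed receives the script s/hello/hi/.
expanded=$(echo 'hello shamil' | sed "s/$greet/hi/")

# Single quotes: sed receives the characters $greet as-is. That text
# does not occur in the input, so nothing is replaced.
literal=$(echo 'hello shamil' | sed 's/$greet/hi/')

printf '%s\n' "$expanded"
printf '%s\n' "$literal"
```

One caveat worth keeping in mind: the variable's value becomes part of the sed script, so a value containing the delimiter or regular expression characters will change how the command is interpreted.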

# Better safe than sorry

When using -i in the sed command, we need to be careful, as it replaces the actual content in the file. (Trust me, I have done this many times)

Therefore, it is good practice to first run the command without the -i flag and check that the replacements are correct. However, if the file is too long to check that way, you can use the following command to create a backup copy of the file and then modify its content.

sed -i.bak '12,30d' file_name

This will delete all lines from 12 to 30, but most importantly it will create a file_name.bak in the same directory before modifying the actual file.
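The backup behaviour can be tried safely in a temporary directory. The sketch below assumes a sed that accepts a suffix attached directly to -i (as in the command above); the file name sample.txt is hypothetical.

```shell
# Work in a throwaway directory so no real file is touched.
workdir=$(mktemp -d)
file="$workdir/sample.txt"   # hypothetical file name

printf 'keep\nremove\nkeep\n' > "$file"

# Delete line 2 in place, keeping the original as sample.txt.bak.
sed -i.bak '2d' "$file"

edited=$(cat "$file")
backup=$(cat "$file.bak")

# To undo the edit, copy the backup over the modified file.
cp "$file.bak" "$file"
restored=$(cat "$file")

printf '%s\n---\n%s\n' "$edited" "$backup"

# Clean up the temporary directory.
rm -r "$workdir"
```

If the edit turns out to be wrong, the .bak file is all that is needed to get the original content back.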

Who knows, this might just end up saving your job (◠﹏◠)

(EDIT: See this comment for more info on -i usages)
