Perl one-liners introduction
Sundeep
Posted on November 9, 2020
Cover image generated using carbon
This post will give an overview of perl syntax for command line usage and some examples to show what kind of problems are typically suited for one-liners.
Why use Perl for one-liners?
I assume you are already familiar with use cases where the command line is more productive than a GUI. See also this series of articles titled Unix as IDE.
A shell utility like bash provides built-in commands and scripting features to easily solve and automate various tasks. External *nix commands like grep, sed, awk, sort, find, parallel, etc. can be combined to work with each other. Depending upon your familiarity with those tools, you can either use perl as a single replacement or complement them for specific use cases.
Here are some one-liners (options will be explained later):
- perl -pe 's/(?:\x27;\x27|";")(*SKIP)(*F)|;/#/g' changes ; to # but doesn't change ; within single or double quotes
- perl -MList::Util=uniq -e 'print uniq <>' retains only the first copy of duplicated lines, using the built-in module List::Util
- perl -MRegexp::Common=net -nE 'say join "\n", //g if /$RE{net}{IPv4}/' extracts only IPv4 addresses, using the third-party Regexp::Common module
- Some stackoverflow Q&A that I've answered over the years with a simpler perl solution compared to other cli tools
The selling points of perl over tools like grep, sed and awk include a feature-rich regular expression engine and standard/third-party modules. If you don't already know the syntax and idioms for sed and awk, learning command line options for perl would be the easier option. Another advantage is that perl is more portable, since the GNU, BSD and Mac implementations of those cli tools differ from each other. The main disadvantage is that perl is likely to be slower for features that are supported out of the box by those tools.
See also unix.stackexchange: when to use grep, sed, awk, perl, etc
Command line options
perl -h gives the list of all command line options, along with a brief description. See perldoc: perlrun for documentation on these command switches.
Option | Description |
---|---|
-0[octal] | specify record separator (\0, if no argument) |
-a | autosplit mode with -n or -p (splits $_ into @F) |
-C[number/list] | enables the listed Unicode features |
-c | check syntax only (runs BEGIN and CHECK blocks) |
-d[:debugger] | run program under debugger |
-D[number/list] | set debugging flags (argument is a bit mask or alphabets) |
-e program | one line of program (several -e's allowed, omit programfile) |
-E program | like -e, but enables all optional features |
-f | don't do $sitelib/sitecustomize.pl at startup |
-F/pattern/ | split() pattern for -a switch (//'s are optional) |
-i[extension] | edit <> files in place (makes backup if extension supplied) |
-Idirectory | specify @INC/#include directory (several -I's allowed) |
-l[octal] | enable line ending processing, specifies line terminator |
-[mM][-]module | execute use/no module... before executing program |
-n | assume while (<>) { ... } loop around program |
-p | assume loop like -n but print line also, like sed |
-s | enable rudimentary parsing for switches after programfile |
-S | look for programfile using PATH environment variable |
-t | enable tainting warnings |
-T | enable tainting checks |
-u | dump core after parsing program |
-U | allow unsafe operations |
-v | print version, patchlevel and license |
-V[:variable] | print configuration summary (or a single Config.pm variable) |
-w | enable many useful warnings |
-W | enable all warnings |
-x[directory] | ignore text before #!perl line (optionally cd to directory) |
-X | disable all warnings |
This post will show examples with the -e, -l, -n, -p and -a options.
Executing Perl code
If you want to execute a perl program file, one way is to pass the filename as an argument to the perl command.
$ echo 'print "Hello Perl\n"' > hello.pl
$ perl hello.pl
Hello Perl
For short programs, you can also directly pass the code as an argument to the -e or -E options. See perldoc: feature for details about the features enabled by the -E option.
$ perl -e 'print "Hello Perl\n"'
Hello Perl
$ # multiple statements can be issued separated by ;
$ # -l option will be covered in detail later, appends \n to 'print' here
$ perl -le '$x=25; $y=12; print $x**$y'
59604644775390625
$ # or, use -E and 'say' instead of -l and 'print'
$ perl -E '$x=25; $y=12; say $x**$y'
59604644775390625
Filtering
perl one-liners can be used for filtering lines matched by a regexp, similar to grep, sed and awk. And similar to many command line utilities, perl can accept input from both stdin and file arguments.
$ # sample stdin data
$ printf 'gate\napple\nwhat\nkite\n'
gate
apple
what
kite
$ # print all lines containing 'at'
$ # same as: grep 'at' and sed -n '/at/p' and awk '/at/'
$ printf 'gate\napple\nwhat\nkite\n' | perl -ne 'print if /at/'
gate
what
$ # print all lines NOT containing 'e'
$ # same as: grep -v 'e' and sed -n '/e/!p' and awk '!/e/'
$ printf 'gate\napple\nwhat\nkite\n' | perl -ne 'print if !/e/'
what
By default, grep, sed and awk will automatically loop over input content line by line (with \n as the line distinguishing character). The -n or -p option will enable this feature for perl.
As seen before, the -e option accepts code as a command line argument. Many shortcuts are available to reduce the amount of typing needed. In the above examples, a regular expression (defined by the pattern between a pair of forward slashes) has been used to filter the input. When the input string isn't specified, the test is performed against the special variable $_, which has the contents of the current input line here (the correct term would be input record). $_ is also the default argument for many functions like print and say. To summarize:
- /REGEXP/FLAGS is a shortcut for $_ =~ m/REGEXP/FLAGS
- !/REGEXP/FLAGS is a shortcut for $_ !~ m/REGEXP/FLAGS

See perldoc: match for help on the m operator. See perldoc: special variables for documentation on $_, $&, etc.
Here's an example with file input instead of stdin.
$ cat table.txt
brown bread mat hair 42
blue cake mug shirt -7
yellow banana window shoes 3.14
$ perl -nE 'say $& if /(?<!-)\d+$/' table.txt
42
14
$ # if the condition isn't required, capture groups can be used
$ perl -nE 'say /(\d+)$/' table.txt
42
7
14
Substitution
Use the s operator for search and replace requirements. By default, this operates on $_ when the input string isn't provided. For these examples, the -p option is used instead of the -n option, so that the value of $_ is automatically printed after processing each input line. See perldoc: search and replace for documentation and examples.
$ # for each input line, change only first ':' to '-'
$ # same as: sed 's/:/-/' and awk '{sub(/:/, "-")} 1'
$ printf '1:2:3:4\na:b:c:d\n' | perl -pe 's/:/-/'
1-2:3:4
a-b:c:d
$ # for each input line, change all ':' to '-'
$ # same as: sed 's/:/-/g' and awk '{gsub(/:/, "-")} 1'
$ printf '1:2:3:4\na:b:c:d\n' | perl -pe 's/:/-/g'
1-2-3-4
a-b-c-d
The s operator modifies the input string it is acting upon if the pattern matches. In addition, it will return the number of substitutions made if successful, otherwise it returns a false value (empty string or 0). You can use the r flag to return the string after substitution instead of in-place modification. See perldoc: perlretut for a tutorial on Perl regular expressions.
Field processing
Consider the sample input file shown below with fields separated by a single space character.
$ cat table.txt
brown bread mat hair 42
blue cake mug shirt -7
yellow banana window shoes 3.14
Here are some examples that are based on specific fields rather than the entire line. The -a option will cause the input line to be split based on whitespace and the field contents can be accessed using the @F special array variable. Leading and trailing whitespace will be suppressed, so there's no possibility of empty fields.
$ # print the second field of each input line
$ # same as: awk '{print $2}' table.txt
$ perl -lane 'print $F[1]' table.txt
bread
cake
banana
$ # print lines only if last field is a negative number
$ # same as: awk '$NF<0' table.txt
$ perl -lane 'print if $F[-1] < 0' table.txt
blue cake mug shirt -7
$ # change 'b' to 'B' only for the first field
$ # same as: awk '{gsub(/b/, "B", $1)} 1' table.txt
$ perl -lane '$F[0] =~ s/b/B/g; print "@F"' table.txt
Brown bread mat hair 42
Blue cake mug shirt -7
yellow banana window shoes 3.14
BEGIN and END
You can use a BEGIN{} block when you need to execute something before input is read and an END{} block to execute something after all of the input has been processed.
$ # same as: awk 'BEGIN{print "---"} 1; END{print "%%%"}'
$ seq 4 | perl -pE 'BEGIN{say "---"} END{say "%%%"}'
---
1
2
3
4
%%%
ENV hash
When it comes to automation and scripting, you'd often need to construct commands that can accept input from the user, a file, the output of a shell command, etc. The examples in this post assume bash as the shell being used. To access environment variables of the shell, you can use the special hash variable %ENV with the name of the environment variable as a string key.
$ # existing environment variable
$ # output shown here is for my machine, would differ for you
$ perl -E 'say $ENV{HOME}'
/home/learnbyexample
$ perl -E 'say $ENV{SHELL}'
/bin/bash
$ # defined along with perl command
$ # note that the variable is placed before the shell command
$ word='hello' perl -E 'say $ENV{word}'
hello
$ # the input characters are preserved as is
$ ip='hi\nbye' perl -E 'say $ENV{ip}'
hi\nbye
Here's another example, where a regexp is passed as the content of an environment variable.
$ cat word_anchors.txt
sub par
spar
apparent effort
two spare computers
cart part tart mart
$ # assume 'r' is a shell variable that has to be passed to the perl command
$ r='\Bpar\B'
$ rgx="$r" perl -ne 'print if /$ENV{rgx}/' word_anchors.txt
apparent effort
two spare computers
You can also make use of the -s option to assign a perl variable.
$ r='\Bpar\B'
$ perl -sne 'print if /$rgx/' -- -rgx="$r" word_anchors.txt
apparent effort
two spare computers
As an example, see my repo ch: command help for a practical shell script, where commands are constructed dynamically.
Executing external commands
You can execute external commands using the system function. See perldoc: system for documentation and details like how the string/list argument is processed before it is executed.
$ perl -e 'system("echo Hello World")'
Hello World
$ perl -e 'system("wc -w <word_anchors.txt")'
12
$ perl -e 'system("seq -s, 10 > out.txt")'
$ cat out.txt
1,2,3,4,5,6,7,8,9,10
The return value of system or the special variable $? can be used to act upon the exit status of the command issued. As per the documentation:
"The return value is the exit status of the program as returned by the wait call. To get the actual exit value, shift right by eight"
$ perl -E '$es=system("ls word_anchors.txt"); say $es'
word_anchors.txt
0
$ perl -E 'system("ls word_anchors.txt"); say $?'
word_anchors.txt
0
$ perl -E 'system("ls xyz.txt"); say $?'
ls: cannot access 'xyz.txt': No such file or directory
512
To save the result of an external command, use backticks or the qx operator. See perldoc: qx for documentation and details like separating out STDOUT and STDERR.
$ perl -e '$words = `wc -w <word_anchors.txt`; print $words'
12
$ perl -e '$nums = qx/seq 3/; print $nums'
1
2
3
See also stackoverflow: difference between backticks, system, and exec
Summary
This post introduced some of the common options for perl cli usage, along with typical cli text processing examples. While special purpose cli tools like grep, sed and awk are usually faster, perl has a much more extensive standard library and ecosystem. And you do not have to learn a lot if you are already comfortable with perl but not familiar with those cli tools.
Perl one-liners cookbook
If you liked this post and would like to learn more, check out my ebook using the links below.
You can also get the ebooks as part of Magical one-liners bundle using these links: