Keith Bennett
Posted on April 15, 2019
[Caution: This is a long article!
Sections are mostly self contained, so feel free to skim or skip them.]
Rexe is a Ruby script and gem that multiplies Ruby's usefulness and conciseness on the command line by:
- automating parsing and formatting using JSON, YAML, Ruby marshalling, Awesome Print, and others
- simplifying the use of Ruby as a shell filter, optionally predigesting input as lines, an enumerator, or one big string
- extracting the plumbing from the command line; requires and other options can be set in an environment variable
- enabling the loading of Ruby helper files to keep your code DRY and your command line code high level
- reading and evaluating a ~/.rexerc file on startup for your shared custom code and common requires
Shell scripting is great for simple tasks but for anything nontrivial it can easily get cryptic and awkward (pun intended!).
This problem can often be solved by writing a Ruby script instead. Ruby provides fine grained control in a language that is all about clarity, conciseness, and expressiveness.
Unfortunately, when there are multiple OS commands to be called, then Ruby can be awkward too.
Sometimes a good solution is to combine Ruby and shell scripting on the same command line. Rexe multiplies your power to do so.
Using the Ruby Interpreter on the Command Line
Let's start by seeing what the Ruby interpreter already provides. Here we use ruby on the command line, using an intermediate environment variable to simplify the logic and save the data for use by future commands. An excerpt of the output follows the code:
➜ ~ export EUR_RATES_JSON=`curl https://api.exchangeratesapi.io/latest`
➜ ~ echo $EUR_RATES_JSON | ruby -r json -r yaml -e 'puts JSON.parse(STDIN.read).to_yaml'
---
rates:
  MXN: 21.96
  AUD: 1.5964
  HKD: 8.8092
  ...
base: EUR
date: '2019-03-08'
Unfortunately, the configuration setup (the requires) along with the reading, parsing, and formatting make the command long and tedious, discouraging this approach.
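For comparison, here is roughly what that one liner looks like when written out as a small Ruby script. This is only a sketch: the method name json_to_yaml and the sample data are made up for illustration; only JSON.parse and to_yaml come from the command above.

```ruby
require 'json'
require 'yaml'

# Hypothetical helper: parse JSON text and return its YAML representation.
def json_to_yaml(json_text)
  JSON.parse(json_text).to_yaml
end

# Made-up sample resembling the API response excerpted above.
sample = '{"base":"EUR","date":"2019-03-08","rates":{"MXN":21.96,"AUD":1.5964}}'
puts json_to_yaml(sample)
```

Even in script form, the two requires and the parse/format calls are pure plumbing, which is the part rexe's switches are meant to absorb.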
Rexe
Rexe [see footnote ^1 regarding its origin] can simplify such commands. Among other things, rexe provides switch-activated input parsing and output formatting so that converting from one format to another is trivial. The previous ruby command can be expressed in rexe as:
➜ ~ echo $EUR_RATES_JSON | rexe -mb -ij -oy self
Or, even more concisely (self is the default Ruby source code for rexe commands):
➜ ~ echo $EUR_RATES_JSON | rexe -mb -ij -oy
The command options may seem cryptic, but they're logical so it shouldn't take long to learn them:
- -mb - mode to consume all standard input as a single big string
- -ij - parse that input with JSON; self will be the parsed object
- -oy - output the final value as YAML
If input comes from a JSON or YAML file, rexe determines the input format from the file's extension, and it's even simpler:
➜ ~ rexe -f eur_rates.json -oy
Rexe is at https://github.com/keithrbennett/rexe and can be installed with gem install rexe. Rexe provides several ways to simplify Ruby on the command line, tipping the scale so that it is practical to do it more often.
Here is rexe's help text as of the time of this writing:
rexe -- Ruby Command Line Executor/Filter -- v1.0.1 -- https://github.com/keithrbennett/rexe
Executes Ruby code on the command line,
optionally automating management of standard input and standard output,
and optionally parsing input and formatting output with YAML, JSON, etc.
rexe [options] [Ruby source code]
Options:
-c --clear_options Clear all previous command line options specified up to now
-f --input_file Use this file instead of stdin for preprocessed input;
if filespec has a YAML and JSON file extension,
sets input format accordingly and sets input mode to -mb
-g --log_format FORMAT Log format, logs to stderr, defaults to none
(see -o for format options)
-h, --help Print help and exit
-i, --input_format FORMAT Input format
-ij JSON
-im Marshal
-in None (default)
-iy YAML
-l, --load RUBY_FILE(S) Ruby file(s) to load, comma separated;
! to clear all, or precede a name with '-' to remove
-m, --input_mode MODE Input preprocessing mode (determines what `self` will be):
-ml line; each line is ingested as a separate string
-me enumerator (each_line on STDIN or File)
-mb big string; all lines combined into one string
-mn none (default); no input preprocessing;
self is an Object.new
-n, --[no-]noop Do not execute the code (useful with -g);
For true: yes, true, y, +; for false: no, false, n
-o, --output_format FORMAT Output format (defaults to puts):
-oi Inspect
-oj JSON
-oJ Pretty JSON
-om Marshal
-on No Output (default)
-op Puts
-os to_s
-oy YAML
-r, --require REQUIRE(S) Gems and built-in libraries to require, comma separated;
! to clear all, or precede a name with '-' to remove
---------------------------------------------------------------------------------------
In many cases you will need to enclose your source code in single or double quotes.
If source code is not specified, it will default to 'self',
which is most likely useful only in a filter mode (-ml, -me, -mb).
If there is a .rexerc file in your home directory, it will be run as Ruby code
before processing the input.
If there is a REXE_OPTIONS environment variable, its content will be prepended
to the command line so that you can specify options implicitly
(e.g. `export REXE_OPTIONS="-r awesome_print,yaml"`)
Simplifying the Rexe Invocation
There are two main ways we can make the rexe command line even more concise:
- by extracting configuration into the REXE_OPTIONS environment variable
- by extracting low level and/or shared code into helper files that are loaded using -l, or implicitly with ~/.rexerc
The REXE_OPTIONS Environment Variable
The REXE_OPTIONS environment variable can contain command line options that would otherwise be specified on the rexe command line:
Instead of this:
➜ ~ rexe -r wifi-wand -oa WifiWand::MacOsModel.new.wifi_info
you can do this:
➜ ~ export REXE_OPTIONS="-r wifi-wand -oa"
➜ ~ rexe WifiWand::MacOsModel.new.wifi_info
➜ ~ # [more rexe commands with the same options]
Putting configuration options in REXE_OPTIONS effectively creates custom defaults, and is useful when you use the same options in most or all of your commands. Any options specified on the rexe command line will override the environment variable options.
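To make the "prepended to the command line" idea concrete, here is a minimal sketch of how such a merge could work. It is an assumption for illustration, not rexe's actual code; the method name effective_args is made up.

```ruby
require 'shellwords'

# Hypothetical helper (not rexe's actual code): build the effective argument
# list by placing the environment variable's options before the real ones.
def effective_args(env, argv)
  Shellwords.split(env.fetch('REXE_OPTIONS', '')) + argv
end

env  = { 'REXE_OPTIONS' => '-r wifi-wand -oa' }
argv = ['-oy', 'WifiWand::MacOsModel.new.wifi_info']
p effective_args(env, argv)
```

With an option parser in which a later flag overwrites an earlier one, the -oy given on the command line would win over the -oa from the environment variable, which matches the override behavior described above.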
Like any environment variable, REXE_OPTIONS could also be set in your startup script, input on a command line using export, or in another script loaded with source or . (the dot command).
Loading Files
The environment variable approach works well for command line options, but what if we want to specify Ruby code (e.g. methods) that can be used by your rexe code?
For this, rexe lets you load Ruby files, using the -l option, or implicitly (without your specifying it) in the case of the ~/.rexerc file. Here is an example of something you might include in such a file:
# Open YouTube to Wagner's "Ride of the Valkyries"
def valkyries
  `open "http://www.youtube.com/watch?v=P73Z6291Pt8&t=0m28s"`
end
To digress a bit, why would you want this? You might want to be able to go to another room until a long job completes, and be notified when it is done. The valkyries method will launch a browser window pointed to Richard Wagner's "Ride of the Valkyries" starting at a lively point in the music [see footnote ^2 regarding autoplay]. (The open command is Mac specific and could be replaced with start on Windows, a browser command name, etc.) [see footnote ^3 regarding OS portability].
If you like this kind of audio notification, you could download public domain audio files and use a command line player like afplay on Mac OS, or mpg123 or ogg123 on Linux. This approach is lighter weight, requires no network access, and will not leave an open browser window for you to close.
Here is an example of how you might use the valkyries method, assuming the above configuration is loaded from your ~/.rexerc file or an explicitly loaded file:
➜ ~ tar czf /tmp/my-whole-user-space.tar.gz ~ ; rexe valkyries
(Note that ; is used rather than && because we want to hear the music whether or not the command succeeds.)
You might be thinking that creating an alias or a minimal shell script (instead of a Ruby script) for this open would be a simpler and more natural approach, and I would agree with you. However, over time the number of these could become unmanageable, whereas using Ruby you could build a pretty extensive and well organized library of functionality. Moreover, that functionality could be made available to all your Ruby code (for example, by putting it in a gem), and not just command line one liners.
For example, you could have something like this in a gem or loaded file:
require 'shellwords' # needed for Shellwords.escape when used outside rexe

def play(piece_code)
  pieces = {
    hallelujah: "https://www.youtube.com/watch?v=IUZEtVbJT5c&t=0m20s",
    valkyries:  "http://www.youtube.com/watch?v=P73Z6291Pt8&t=0m28s",
    wm_tell:    "https://www.youtube.com/watch?v=j3T8-aeOrbg&t=0m1s",
    # ... and many, many more
  }
  `open #{Shellwords.escape(pieces.fetch(piece_code))}`
end
...which you could then call like this:
➜ ~ tar czf /tmp/my-whole-user-space.tar.gz ~ ; rexe 'play(:hallelujah)'
(You need to quote the play call because otherwise the shell will process and remove the parentheses. Alternatively you could escape the parentheses with backslashes.)
One of the examples at the end of this article shows how you could have different music play for success and failure.
Logging
A log entry is optionally output to standard error after completion of the code. This entry is a hash representation (to be precise, to_h) of the $RC OpenStruct described in the $RC section below. It contains the version, date/time of execution, source code to be evaluated, options (after parsing both the REXE_OPTIONS environment variable and the command line), and the execution time of your Ruby code:
➜ ~ echo $EUR_RATES_JSON | rexe -gy -ij -mb -oa -n self
---
:count: 0
:rexe_version: 1.0.0
:start_time: '2019-04-15T13:12:15+08:00'
:source_code: self
:options:
  :input_filespec:
  :input_format: :json
  :input_mode: :one_big_string
  :loads: []
  :output_format: :awesome_print
  :requires:
  - awesome_print
  - json
  - yaml
  :log_format: :yaml
  :noop: true
:duration_secs: 0.050326
We specified -gy for YAML format; there are other formats as well (see the help output or this document) and the default is -gn, which means don't output the log entry at all.
The requires you see were not explicitly specified but were automatically added because Rexe will add any requires needed for automatic parsing and formatting, and we specified those formats in the command line options -gy -ij -oa.
This extra output is sent to standard error (stderr) instead of standard output (stdout) so that it will not pollute the "real" data when stdout is piped to another command.
If you would like to append this informational output to a file (e.g. rexe.log), you could do something like this:
➜ ~ rexe ... -gy 2>>rexe.log
Input Modes
Rexe tries to make it simple and convenient for you to handle standard input, and in different ways. Here is the help text relating to input modes:
-m, --input_mode MODE Input preprocessing mode (determines what `self` will be):
-ml line; each line is ingested as a separate string
-me enumerator (each_line on STDIN or File)
-mb big string; all lines combined into one string
-mn none (default); no input preprocessing;
self is an Object.new
The first three are filter modes; they make standard input available to your code as self.
The last (and default) is the executor mode. It merely assists you in executing the code you provide without any special implicit handling of standard input. Here is more detail on these modes:
-ml "Line" Filter Mode
In this mode, your code would be called once per line of input, and in each call, self would evaluate to each line of text:
➜ ~ echo "hello\ngoodbye" | rexe -ml puts reverse
olleh
eybdoog
reverse is implicitly called on each line of standard input. self is the input line in each call (we could also have used self.reverse but the self. would have been redundant).
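If it helps to picture what "self is the input line" means, here is a minimal sketch in plain Ruby. It is an assumption for illustration, not rexe's actual implementation; the method name each_line_as_self is made up, and it assumes the trailing newline is removed from each line, as the output above suggests.

```ruby
require 'stringio'

# Sketch of line mode (an assumption, not rexe's actual code): evaluate the
# block once per input line, with self set to that line (newline removed).
def each_line_as_self(io, &code)
  io.each_line.map { |line| line.chomp.instance_exec(&code) }
end

input = StringIO.new("hello\ngoodbye\n")
puts each_line_as_self(input) { reverse }
```

Inside the block, a bare reverse resolves to String#reverse on the current line, which is why no explicit receiver is needed.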
Be aware that, in this mode, if you are using an automatic output mode (anything other than the default -on no output mode), although you can control the content of output records, there is no way to selectively exclude records from being output. Even if the result of the code is nil or the empty string, a newline will be output. To prevent this, you can do one of the following:
- use -me Enumerator mode instead and call select, filter, reject, etc.
- use the (default) -on no output mode and call puts explicitly for the output you do want
-me "Enumerator" Filter Mode
In this mode, your code is called only once, and self is an enumerator dispensing all lines of standard input. To be more precise, it is the enumerator returned by the each_line method, on $stdin or the input file, whichever is applicable.
Dealing with input as an enumerator enables you to use the wealth of Enumerable methods such as select, to_a, map, etc.
Here is an example of using -me to add line numbers to the first 3 files in the directory listing:
➜ ~ ls / | rexe -me "first(3).each_with_index { |ln,i| puts '%5d %s' % [i, ln] }"
0 AndroidStudioProjects
1 Applications
2 Desktop
Since self is an enumerable, we can call first on it. We've used the default output mode -on (no output mode), which says don't do any automatic output, just the output explicitly specified by puts in the source code.
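The same idea can be sketched in plain Ruby to show what the code sees as self in this mode. Again, this is an illustration under assumptions, not rexe's actual implementation; with_lines_enumerator and the sample listing are made up.

```ruby
require 'stringio'

# Sketch of enumerator mode (an assumption, not rexe's actual code): evaluate
# the block once, with self set to the each_line enumerator of the input.
def with_lines_enumerator(io, &code)
  io.each_line.instance_exec(&code)
end

listing = StringIO.new("AndroidStudioProjects\nApplications\nDesktop\nDocuments\n")
numbered = with_lines_enumerator(listing) do
  first(3).each_with_index.map { |ln, i| '%5d %s' % [i, ln.chomp] }
end
puts numbered
```

Because first(3) stops after three lines, the rest of the input is never read.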
-mb "Big String" Filter Mode
In this mode, all standard input is combined into a single (possibly large and possibly multiline) string.
A good example of when you would use this is when you need to parse a multiline JSON or YAML representation of an object; you need to pass all the standard input to the parse method. This is the mode that was used in the first rexe example in this article.
-mn "No Input" Executor Mode -- The Default
In this mode, no special handling of standard input is done at all; if you want standard input you need to code it yourself (e.g. with STDIN.read).
self evaluates to a new instance of Object, which would be used if you defined methods, constants, instance variables, etc., in your code.
Filter Input Mode Memory Considerations
If you are using one of the filter modes, and may have more input than would fit in memory, you can do one of the following:
- use -ml (line) mode so you are fed only 1 line at a time
- use an Enumerator, either by a) specifying the -me (enumerator) mode option, or b) using -mn (no input) mode in conjunction with something like STDIN.each_line. Then:
  - Make sure not to call any methods (e.g. map, select) that will produce an array of all the input because that will pull all the records into memory, or:
  - use lazy enumerators
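Here is a small sketch of the lazy enumerator approach. The helper name first_matches and the endless made-up input are assumptions for illustration; the point is that only as many lines as needed are ever pulled from the source.

```ruby
# Hypothetical helper: return the first `count` lines matching `pattern`,
# reading only as many lines from `lines` as needed.
def first_matches(lines, pattern, count)
  lines.lazy.grep(pattern).map(&:chomp).first(count)
end

# Made-up endless source standing in for a very large input stream.
endless = Enumerator.new do |yielder|
  n = 0
  loop { yielder << "record #{n += 1}\n" }
end

p first_matches(endless, /7/, 3)
```

Without lazy, grep would try to build an array of every match and would never return on an endless source. In -me mode the same chain could presumably start from self, e.g. lazy.grep(...).first(3).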
Input Formats
Rexe can parse your input in any of several formats if you like. You would request this in the input format (-i) option. Legal values are:
- -ij - JSON
- -im - Marshal
- -in - None
- -iy - YAML
Except for -in, which passes the text to your code untouched, your input will be parsed in the specified format, and the resulting object passed into your code as self.
The input format option is ignored if the input mode is -mn ("no input" executor mode, the default), since there is no preprocessing of standard input in that mode.
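One way to picture switch-activated parsing is a lookup from format letter to parser. The sketch below is an assumption for illustration, not rexe's actual code; the names parsers and parse_input are made up, and YAML.safe_load is used here simply as a conservative choice.

```ruby
require 'json'
require 'yaml'

# Hypothetical lookup (not rexe's actual code) mapping each input format
# letter to a callable that parses the raw input text.
def parsers
  {
    'j' => ->(text) { JSON.parse(text) },
    'm' => ->(text) { Marshal.load(text) },
    'n' => ->(text) { text },
    'y' => ->(text) { YAML.safe_load(text) }
  }
end

def parse_input(text, format_letter)
  parsers.fetch(format_letter).call(text)
end

p parse_input('{"base":"EUR"}', 'j')
p parse_input("base: EUR\n", 'y')
p parse_input(Marshal.dump([1, 2]), 'm')
p parse_input('untouched', 'n')
```

Note that Marshal.load should only ever be given data you trust.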
Output Formats
Several output formats are provided for your convenience:
- -oa - Awesome Print - calls .ai on the object to get the string that ap would print
- -oi - Inspect - calls inspect on the object
- -oj - JSON - calls to_json on the object
- -oJ - Pretty JSON - calls JSON.pretty_generate with the object
- -on - (default) No Output - output is suppressed
- -op - Puts - produces what puts would output
- -os - To String - calls to_s on the object
- -oy - YAML - calls to_yaml on the object
All formats will implicitly require anything needed to accomplish their task (e.g. require 'yaml').
The default is -on to produce no output at all (unless explicitly coded to do so). If you prefer a different default such as -op for puts mode, you can specify that in your REXE_OPTIONS environment variable.
If two letters are provided, the first will be used for tty devices (e.g. the terminal when not redirected or piped), and the second for block devices (e.g. when redirected or piped to another process).
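To illustrate both the format lookup and the two letter rule, here is a sketch covering a subset of the formats (the standard library ones). It is an assumption for illustration, not rexe's actual code; formatters, pick_format, and format_output are made-up names.

```ruby
require 'json'
require 'yaml'

# Hypothetical lookup (not rexe's actual code) for some of the output formats.
def formatters
  {
    'i' => ->(obj) { obj.inspect },
    'j' => ->(obj) { obj.to_json },
    'J' => ->(obj) { JSON.pretty_generate(obj) },
    's' => ->(obj) { obj.to_s },
    'y' => ->(obj) { obj.to_yaml }
  }
end

# With two letters, use the first for a terminal and the second otherwise.
def pick_format(letters, tty)
  letters.length == 2 && !tty ? letters[1] : letters[0]
end

def format_output(obj, letters, tty: $stdout.tty?)
  formatters.fetch(pick_format(letters, tty)).call(obj)
end

data = { 'base' => 'EUR', 'rates' => { 'PHP' => 58.986 } }
puts format_output(data, 'j')
puts format_output(data, 'Jy') # pretty JSON on a terminal, YAML when piped
```

A two letter value like this lets one command be readable interactively while still producing machine friendly output in a pipeline.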
You may wonder why these formats are provided, given that their functionality could be included in the custom code instead. Here's why:
- The savings in command line length goes a long way to making these commands more readable and feasible.
- It's much simpler to switch formats, as there is no need to change the code itself.
- This approach enables parameterization of the output format.
Reading Input from a File
Rexe also simplifies getting input from a file rather than standard input. The -f option takes a filespec and does with its content exactly what it would have done with standard input. This shortens:
➜ ~ cat filename.ext | rexe ...
...to...
➜ ~ rexe -f filename.ext ...
This becomes even more useful if you are using files whose extensions are .yml, .yaml, or .json (case insensitively). In this case the input format and mode will be set automatically for you to:
- -iy (YAML) or -ij (JSON) depending on the file extension
- -mb (one big string mode), which assumes that the most common use case will be to parse the entire file at once
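The extension rule above could be sketched like this. It is an assumption for illustration, not rexe's actual code; input_options_for is a made-up name, and the option keys simply mirror those shown in the log output earlier.

```ruby
# Hypothetical helper (not rexe's actual code): infer input options from a
# file name's extension, ignoring case.
def input_options_for(filespec)
  format =
    case File.extname(filespec).downcase
    when '.json'         then :json
    when '.yml', '.yaml' then :yaml
    end
  format ? { input_format: format, input_mode: :one_big_string } : {}
end

p input_options_for('eur_rates.json')
p input_options_for('config.YML')
p input_options_for('notes.txt')
```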
So the example we gave above:
➜ ~ export EUR_RATES_JSON=`curl https://api.exchangeratesapi.io/latest`
➜ ~ echo $EUR_RATES_JSON | rexe -mb -ij -oy self
...could be changed to:
➜ ~ curl https://api.exchangeratesapi.io/latest > eur_rates.json
➜ ~ rexe -f eur_rates.json -oy self
Another possible win for using -f is that since it is a command line option, it could be specified in REXE_OPTIONS. This could be useful if you are doing many operations on the same file.
If you need to override the input mode and format automatically configured for file input, you can simply specify the desired options on the command line after the -f:
➜ ~ rexe -f eur_rates.json -mb -in 'puts self.class, self[0..20]'
String
{"base":"EUR","rates"
'self' as Default Source Code
To make rexe even more concise, you do not need to specify any source code when you want that source code to be self. This would be the case for simple format conversions, as in JSON to YAML conversion mentioned above:
➜ ~ rexe -f eur_rates.json -oy
# or
➜ ~ echo $EUR_RATES_JSON | rexe -mb -ij -oy
---
rates:
  JPY: 126.63
  BRL: 4.3012
  NOK: 9.6915
  ...
This feature is probably only useful in the filter modes, since in the executor mode (-mn) self is a new instance of Object and hardly ever useful as an output value.
The $RC Global OpenStruct
For your convenience, the information displayed in the log entry (see the Logging section above) is available to your code at runtime by accessing the $RC global variable, which contains an OpenStruct. Let's print out its contents using YAML:
➜ ~ rexe -oy '$RC'
--- !ruby/object:OpenStruct
table:
  :count: 0
  :rexe_version: 1.0.0
  :start_time: '2019-04-15T13:25:56+08:00'
  :source_code: "$RC"
  :options:
    :input_filespec:
    :input_format: :none
    :input_mode: :none
    :loads: []
    :output_format: :yaml
    :requires:
    - yaml
    :log_format: :none
    :noop: false
modifiable: true
Probably most useful in that object at runtime is the record count, accessible with both $RC.count and $RC.i. This is only really useful in line mode, because in the others it will always be 0 or 1. Here is an example of how you might use it as a kind of progress indicator:
➜ ~ find / | rexe -ml -on \
'if $RC.i % 1000 == 0; puts %Q{File entry ##{$RC.i} is #{self}}; end'
...
File entry #106000 is /usr/local/Cellar/go/1.11.5/libexec/src/cmd/vendor/github.com/google/pprof/internal/driver/driver_test.go
File entry #107000 is /usr/local/Cellar/go/1.11.5/libexec/src/go/types/testdata/cycles1.src
File entry #108000 is /usr/local/Cellar/go/1.11.5/libexec/src/runtime/os_linux_novdso.go
...
Note that a single quote was used for the Ruby code here; if a double quote were used, the $RC would have been interpreted and removed by the shell.
Implementing Domain Specific Languages (DSL's)
Defining methods in your loaded files enables you to effectively define a DSL for your command line use. You could use different load files for different projects, domains, or contexts, and define aliases or one line scripts to give them meaningful names. For example, if you had Ansible helper code in ~/projects/ansible-tools/rexe-ansible.rb, you could define an alias in your startup script:
➜ ~ alias rxans="rexe -l ~/projects/ansible-tools/rexe-ansible.rb"
...and then you would have an Ansible DSL available for you to use by calling rxans.
In addition, since you can also call pry on the context of any object, you can provide a DSL in a REPL (shell) trivially easily. Just to illustrate, here's how you would open a REPL on the File class:
➜ ~ ruby -r pry -e File.pry
# or
➜ ~ rexe -r pry File.pry
self would evaluate to the File class, so you could call class methods using only their names:
➜ ~ rexe -r pry File.pry
[6] pry(File)> size '/etc/passwd'
6804
[7] pry(File)> directory? '.'
true
[8] pry(File)> file?('/etc/passwd')
true
This could be really handy if you call pry on a custom object that has methods especially suited to your task:
➜ ~ rexe -r wifi-wand,pry WifiWand::MacOsModel.new.pry
[1] pry(#<WifiWand::MacOsModel>)> random_mac_address
"a1:ea:69:d9:ca:05"
[2] pry(#<WifiWand::MacOsModel>)> connected_network_name
"My WiFi"
Ruby is supremely well suited for DSL's since it does not require parentheses for method calls, so calls to your custom methods look like built in language commands and keywords.
Quotation Marks and Quoting Strings in Your Ruby Code
One complication of using utilities like rexe where Ruby code is specified on the command line is that you need to be careful about the shell's special treatment of certain characters. For this reason, it is often necessary to quote the Ruby code. You can use single or double quotes to have the shell treat your source code as a single argument. An excellent reference for how they differ is on StackOverflow at https://stackoverflow.com/questions/6697753/difference-between-single-and-double-quotes-in-bash.
Personally, I find single quotes more useful since I usually don't want special characters in my Ruby code like $ to be processed by the shell.
Sometimes it doesn't matter:
➜ ~ rexe 'puts "hello"'
hello
➜ ~ rexe "puts 'hello'"
hello
We can also use %q or %Q, and sometimes this eliminates the need for the outer quotes altogether:
➜ ~ rexe puts %q{hello}
hello
➜ ~ rexe puts %Q{hello}
hello
Sometimes the quotes to use on the outside (quoting your command in the shell) need to be chosen based on which quotes are needed on the inside. For example, in the following command, we need double quotes in Ruby in order for interpolation to work, so we use single quotes on the outside:
➜ ~ rexe puts '"The time is now #{Time.now}"'
The time is now 2019-03-29 16:41:26 +0800
In this case we also need to use single quotes on the outside, because we need literal double quotes in a %Q{} expression:
➜ ~ rexe 'puts %Q{The operating system name is "#{`uname`.chomp}".}'
The operating system name is "Darwin".
We can eliminate the need for any quotes in the Ruby code using %Q{}:
➜ ~ rexe puts '%Q{The time is now #{Time.now}}'
The time is now 2019-03-29 17:06:13 +0800
Of course you can always escape the quotes with backslashes instead, but that is probably more difficult to read.
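As a quick reminder of how the two Ruby forms differ once the shell is out of the picture, here is a tiny sketch (the method name quoted_both_ways is made up for illustration):

```ruby
# Hypothetical helper returning the same text quoted both ways.
def quoted_both_ways(name)
  literal      = %q{Hello from #{name}} # like single quotes: no interpolation
  interpolated = %Q{Hello from #{name}} # like double quotes: interpolation
  [literal, interpolated]
end

puts quoted_both_ways('rexe')
```

So %q is the one to reach for when the text should be passed through untouched, and %Q when you need interpolation or escapes such as \n.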
No Op Mode
The -n no-op mode will result in the specified source code not being executed. This can sometimes be handy in conjunction with a -g (logging) option, if you are building a rexe command and want to inspect the configuration options before executing the Ruby code.
Mimicking Method Arguments
You may want to support arguments in your rexe commands. It's a little kludgy, but you could do this by piping in the arguments as rexe's stdin.
One of the previous examples downloaded currency conversion rates. To prepare for an example of how to do this, let's find out the available currency codes:
➜ / echo $EUR_RATES_JSON | \
rexe -ij -mb -op "self['rates'].keys.sort.join(' ')"
AUD BGN BRL CAD CHF CNY CZK DKK GBP HKD HRK HUF IDR ILS INR ISK JPY KRW MXN MYR NOK NZD PHP PLN RON RUB SEK SGD THB TRY USD ZAR
The codes output are the legal arguments that could be sent to rexe's stdin as an argument in the command below. Let's find out the Euro exchange rate for PHP, Philippine Pesos:
➜ ~ echo PHP | rexe -ml -op -rjson \
"rate = JSON.parse(ENV['EUR_RATES_JSON'])['rates'][self];\
%Q{1 EUR = #{rate} #{self}}"
1 EUR = 58.986 PHP
In this code, self is the currency code PHP (Philippine Peso). We have accessed the JSON text to parse from the environment variable we previously populated.
Because we "used up" stdin for the PHP argument, we needed to read the JSON data explicitly from the environment variable, and that made the command more complex. A regular Ruby script would handle this more nicely.
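For example, a regular script version might look something like the sketch below, taking the currency code as a real argument. The method name rate_line, the script name mentioned in the comment, and the sample data are all made up for illustration.

```ruby
require 'json'

# Hypothetical helper: format the EUR rate for one currency code, given the
# JSON text previously saved in EUR_RATES_JSON.
def rate_line(rates_json, currency_code)
  rate = JSON.parse(rates_json).fetch('rates').fetch(currency_code)
  "1 EUR = #{rate} #{currency_code}"
end

# In a script (e.g. a hypothetical eur_rate.rb) this could be called as:
#   puts rate_line(ENV.fetch('EUR_RATES_JSON'), ARGV.fetch(0))
sample = '{"base":"EUR","rates":{"PHP":58.986,"USD":1.1235}}'
puts rate_line(sample, 'PHP')
```

Using fetch instead of [] means an unknown currency code raises a KeyError rather than silently printing an empty rate.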
Using the Clipboard for Text Processing
For editing text in an editor, rexe can be used for text transformations that would otherwise need to be done manually.
The system's commands for pasting to and copying from the clipboard can handle the moving of the text between the editor and rexe. On the Mac, we have the following commands:
- pbcopy - copies the content of its stdin to the clipboard
- pbpaste - copies the content from the clipboard to its stdout
Let's say we have the following currency codes displayed on the screen (data abridged for brevity):
AUD BGN BRL PHP TRY USD ZAR
...and we want to turn them into Ruby symbols for inclusion in Ruby source code as keys in a hash whose values will be the display names of the currencies (e.g. "Australian Dollar").
We could manually select that text and use system menu commands or keys to copy it to the clipboard, or we could do this:
➜ ~ echo AUD BGN BRL PHP TRY USD ZAR | pbcopy
After copying this line to the clipboard, we could run this:
➜ ~ pbpaste | rexe -ml -op \
"split.map(&:downcase).map { |s| %Q{ #{s}: '',} }.join(%Q{\n})"
aud: '',
bgn: '',
brl: '',
# ...
If I add | pbcopy to the rexe command, then that output text would be copied into the clipboard instead of displayed in the terminal, and I could then paste it into my editor.
Using the clipboard in manual operations is handy, but using it in automated scripts is a very bad idea, since there is only one clipboard per user session. If you use the clipboard in an automated script you risk an error situation if its content is changed by another process, or, conversely, you could mess up another process when you change the content of the clipboard.
Multiline Ruby Commands
Although rexe is cleanest with short one liners, you may want to use it to include nontrivial Ruby code in your shell script as well. If you do this, you may need to add trailing backslashes to the lines of Ruby code.
What might not be so obvious is that you will often need to use semicolons as statement separators. For example, here is an example without a semicolon:
➜ ~ cowsay hello | rexe -me "print %Q{\u001b[33m} \
puts to_a"
rexe: (eval):1: syntax error, unexpected tIDENTIFIER, expecting '}'
...new { print %Q{\u001b[33m} puts to_a }
... ^~~~
The shell combines all backslash terminated lines into a single line of text, so when the Ruby interpreter sees your code, it's all in a single line:
➜ ~ cowsay hello | rexe -me "print %Q{\u001b[33m} puts to_a"
Adding the semicolon fixes the problem:
➜ ~ cowsay hello | rexe -me "print %Q{\u001b[33m}; \
puts to_a"
_______
< hello >
-------
\ ^__^
\ (oo)\_______
(__)\ )\/\
||----w |
|| ||
Clearing the Require and Load Lists
There may be times when you have specified a load or require on the command line or in the REXE_OPTIONS environment variable, but you want to override it for a single invocation. Here are your options:
1) Unspecify all the requires or loads with the -r! and -l! command line options, respectively.
2) Unspecify individual requires or loads by preceding the name with -, e.g. -r -rails. Array subtraction is used, and array subtraction removes all occurrences of each element of the subtracted (subtrahend) array, so:
➜ ~ rexe -n -r rails,rails,rails,-rails -gP
...
:requires=>["pp"],
...
...would show that the final -rails cancelled all the previous rails specifications.
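The array subtraction behavior is plain Ruby and easy to verify on its own. In this sketch the helper name remove_requires is made up; the subtraction itself is standard Array#-.

```ruby
# Hypothetical helper: remove every occurrence of each name in `removed`.
def remove_requires(requires, removed)
  requires - removed
end

requires = %w[pp rails rails rails]
p remove_requires(requires, %w[rails]) # every "rails" entry goes, not just one
```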
We could have also extracted the requires list programmatically using $RC (described above) by doing this:
➜ ~ rexe -oP -r rails,rails,rails,-rails '$RC[:options][:requires]'
["pp"]
Clearing All Options
You can also clear all options specified up to a certain point in time with the clear options option (-c). This is especially useful if you have specified options in the REXE_OPTIONS environment variable, and want to ignore all of them.
Comma Separated Requires and Loads
For consistency with the ruby interpreter, rexe supports requires with the -r option, but also allows grouping them together using commas:
vvvvvvvvvvvvvvvvvvvvv
➜ ~ echo $EUR_RATES_JSON | rexe -r json,awesome_print 'ap JSON.parse(STDIN.read)'
^^^^^^^^^^^^^^^^^^^^^
Files loaded with the -l option are treated the same way.
Beware of Configured Requires
Requiring gems and modules for all invocations of rexe will make your commands simpler and more concise, but will be a waste of execution time if they are not needed. You can inspect the execution times to see just how much time is being consumed. For example, we can find out that rails takes about 0.63 seconds to load on one system by observing and comparing the execution times with and without the require (output has been abbreviated using grep):
➜ ~ rexe -gy -r rails 2>&1 | grep duration
:duration_secs: 0.660138
➜ ~ rexe -gy 2>&1 | grep duration
:duration_secs: 0.027781
(For the above to work, the rails gem and its dependencies need to be installed.)
Operating System Support
Rexe has been tested successfully on Mac OS, Linux, and Windows Subsystem for Linux (WSL). It is intended as a tool for the Unix shell, and, as such, no attempt is made to support Windows non-Unix shells.
More Examples
Here are some more examples to illustrate the use of rexe.
Using Rexe as a Simple Calculator
To output the result to stdout, you can either call puts or specify the -op option:
➜ ~ rexe puts 1 / 3.0
0.3333333333333333
or:
➜ ~ rexe -op 1 / 3.0
0.3333333333333333
Since * is interpreted by the shell, if we do multiplication, we need to quote the expression:
➜ ~ rexe -op '2 * 7'
14
Of course, if you put the -op in the REXE_OPTIONS environment variable, you don't need to be explicit about the output:
➜ ~ export REXE_OPTIONS=-op
➜ ~ rexe '2 * 7'
14
Outputting ENV
Output the contents of ENV using AwesomePrint [see footnote ^4 regarding ENV.to_s]:
➜ ~ rexe -oa ENV
{
...
"LANG" => "en_US.UTF-8",
"PWD" => "/Users/kbennett/work/rexe",
"SHELL" => "/bin/zsh",
...
}
Reformatting a Command's Output
Show disk space used/free on a Mac's main hard drive's main partition:
➜ ~ df -h | grep disk1s1 | rexe -ml \
"x = split; puts %Q{#{x[4]} Used: #{x[2]}, Avail: #{x[3]}}"
91% Used: 412Gi, Avail: 44Gi
(Note that split is equivalent to self.split, and because the -ml option is used, self is the line of text.)
Formatting for Numeric Sort
Show the 3 longest file names of the current directory, with their lengths, in descending order:
➜ ~ ls | rexe -ml -op "%Q{[%4d] %s} % [length, self]" | sort -r | head -3
[ 50] Agoda_Booking_ID_9999999 49_–_RECEIPT_enclosed.pdf
[ 40] 679a5c034994544aab4635ecbd50ab73-big.jpg
[ 28] 2018-abc-2019-01-16-2340.zip
When you right align numbers using printf formatting, sorting the lines alphabetically will result in sorting them numerically as well.
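Here is a small sketch showing why the padding matters. The helper name by_length_desc and the sample names are made up; the format string is the same '[%4d] %s' idea used in the command above.

```ruby
# Hypothetical helper: prefix each name with its right aligned length, then
# sort the resulting lines as plain text, longest first.
def by_length_desc(names)
  names.map { |name| '[%4d] %s' % [name.length, name] }.sort.reverse
end

names = ['a' * 9, 'b' * 10, 'c' * 100]
puts by_length_desc(names)

# Without padding, a text sort puts "9" after "10" and "100".
p %w[9 10 100].sort
```

This works because the space used for padding sorts before every digit, so shorter numbers always compare as smaller.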
Print yellow (trust me!):
This uses an ANSI escape code to output text to the terminal in yellow:
➜ ~ cowsay hello | rexe -me "print %Q{\u001b[33m}; puts to_a"
➜ ~ # or
➜ ~ cowsay hello | rexe -mb "print %Q{\u001b[33m}; puts self"
➜ ~ # or
➜ ~ cowsay hello | rexe "print %Q{\u001b[33m}; puts STDIN.read"
 _______
< hello >
 -------
        \   ^__^
         \  (oo)\_______
            (__)\       )\/\
                ||----w |
                ||     ||
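The commands above switch the color on but never switch it off, so later output may stay yellow until something resets the terminal. As a sketch, a helper like the following could go in a load file; the constant and method names are hypothetical, not part of rexe.

```ruby
YELLOW = "\e[33m" # same escape sequence as "\u001b[33m"
RESET  = "\e[0m"  # restores the terminal's default attributes

# Hypothetical helper (not part of rexe): wrap the text so the color does not
# leak into whatever is printed next.
def yellow(text)
  "#{YELLOW}#{text}#{RESET}"
end

puts yellow('hello')
```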
More YouTube: Differentiating Success and Failure
Let's take the YouTube example from the "Loading Files" section further. Let's have the video that loads be different for the success or failure of the command.
If we put this in a load file (such as ~/.rexerc):
def play(piece_code)
  pieces = {
    hallelujah: "https://www.youtube.com/watch?v=IUZEtVbJT5c&t=0m20s",
    rick_roll:  "https://www.youtube.com/watch?v=dQw4w9WgXcQ&t=0m43s",
    valkyries:  "http://www.youtube.com/watch?v=P73Z6291Pt8&t=0m28s",
    wm_tell:    "https://www.youtube.com/watch?v=j3T8-aeOrbg",
  }
  `open #{Shellwords.escape(pieces.fetch(piece_code))}`
end

def play_result(success)
  play(success ? :hallelujah : :rick_roll)
end

# Must pipe the exit code into this Ruby process,
# e.g. using `echo $? | rexe play_result_by_exit_code`
def play_result_by_exit_code
  play_result(STDIN.read.chomp == '0')
end
Then when we issue a command that succeeds, the Hallelujah Chorus is played [see footnote ^2]:
➜ ~ uname; echo $? | rexe play_result_by_exit_code
...but when the command fails, in this case, with an executable which is not found, it plays Rick Astley's "Never Gonna Give You Up":
➜ ~ uuuuu; echo $? | rexe play_result_by_exit_code
Reformatting Source Code for Help Text
Another formatting example...I wanted to reformat this source code...
'i' => Inspect
'j' => JSON
'J' => Pretty JSON
'n' => No Output
'p' => Puts (default)
's' => to_s
'y' => YAML
...into something more suitable for my help text. Admittedly, the time it took to do this with rexe probably exceeded the time to do it manually, but it was an interesting exercise and made it easy to try different formats. Here it is, after copying the original text to the clipboard:
➜ ~ pbpaste | rexe -ml -op "sub(%q{'}, '-o').sub(%q{' =>}, %q{ })"
-oi Inspect
-oj JSON
-oJ Pretty JSON
-on No Output
-op Puts (default)
-os to_s
-oy YAML
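For readers who want to see the two substitutions in isolation, here is a minimal Ruby sketch applied to a few of the source lines. The method name to_help_line is hypothetical, and the spacing in the sample lines may differ from the original source file.

```ruby
# A few of the source lines from above, as they would arrive on stdin.
source_lines = [
  "'i' => Inspect",
  "'J' => Pretty JSON",
  "'y' => YAML"
]

# Same two substitutions as the one-liner: the first quote becomes "-o",
# then the closing quote and the arrow are replaced by a space.
def to_help_line(line)
  line.sub("'", '-o').sub("' =>", ' ')
end

help_lines = source_lines.map { |line| to_help_line(line) }
puts help_lines
```

Because sub replaces only the first match, the opening quote is handled by the first call and the closing quote is left for the second.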
Currency Conversion
I travel a lot, and when I visit a country for the first time I often get confused by the exchange rate. I put this in my ~/.rexerc:
# Conversion rate to US Dollars
module Curr
  module_function
  def myr; 4.08 end      # Malaysian Ringgit
  def thb; 31.72 end     # Thai Baht
  def usd; 1.00 end      # US Dollars
  def vnd; 23199.50 end  # Vietnamese Dong
end
If I'm lucky enough to be at my computer when I need to do a conversion, for example, to find the value of 150 Malaysian ringgit in US dollars, I can do this:
➜ rexe git:(master) ✗ rexe puts 150 / Curr.myr
36.76470588235294
Obviously rates will change over time, but this will give me a general idea, which is usually all I need.
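If conversions between two non-dollar currencies come up often, one possible variation is to keep the rates in a hash and go through US dollars. This is only a sketch: the module name CurrSketch, the RATES constant, and the convert method are hypothetical and not part of the ~/.rexerc shown above.

```ruby
# Sketch only: CurrSketch, RATES, and convert are hypothetical names.
module CurrSketch
  # Units of each currency per one US dollar (same figures as above).
  RATES = { myr: 4.08, thb: 31.72, usd: 1.00, vnd: 23_199.50 }.freeze

  module_function

  # Convert by going through US dollars; fetch raises KeyError for unknown codes.
  def convert(amount, from:, to:)
    amount / RATES.fetch(from) * RATES.fetch(to)
  end
end

puts CurrSketch.convert(150, from: :myr, to: :usd)
puts CurrSketch.convert(150, from: :myr, to: :thb).round(2)
```

The first call reproduces the 150 / Curr.myr calculation; the second shows the same amount expressed in Thai baht.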
Reformatting Grep Output
I was recently asked to provide a schema for the data in my rock_books accounting gem. rock_books data is intended to be very small in size, and no database is used. Instead, the input data is parsed on every run, and reports are generated on demand. However, there are data structures (actually class instances) in memory at runtime, and their classes inherit from Struct. The definition lines look like this one:
class JournalEntry < Struct.new(:date, :acct_amounts, :doc_short_name, :description, :receipts)
The grep command line utility prepends each of these matches with a string like this:
lib/rock_books/documents/journal_entry.rb:
So this is what worked well for me:
➜ ~ grep Struct **/*.rb | grep -v OpenStruct | rexe -ml -op \
"a = \
gsub('lib/rock_books/', '') \
.gsub('< Struct.new', '') \
.gsub('; end', '') \
.split('.rb:') \
.map(&:strip); \
\
%q{%-40s %-s} % [a[0] + %q{.rb}, a[1]]"
...which produced this output:
cmd_line/command_line_interface.rb class Command (:min_string, :max_string, :action)
documents/book_set.rb class BookSet (:run_options, :chart_of_accounts, :journals)
documents/journal.rb class Entry (:date, :amount, :acct_amounts, :description)
documents/journal_entry.rb class JournalEntry (:date, :acct_amounts, :doc_short_name, :description, :receipts)
documents/journal_entry_builder.rb class JournalEntryBuilder (:journal_entry_context)
reports/report_context.rb class ReportContext (:chart_of_accounts, :journals, :page_width)
types/account.rb class Account (:code, :type, :name)
types/account_type.rb class AccountType (:symbol, :singular_name, :plural_name)
types/acct_amount.rb class AcctAmount (:date, :code, :amount, :journal_entry_context)
types/journal_entry_context.rb class JournalEntryContext (:journal, :linenum, :line)
Although there's a lot going on in this code, the vertical and horizontal alignments and spacing make the code straightforward to follow. Here's what it does:
- grep the code base for "Struct"
- exclude references to "OpenStruct" with grep -v
- remove unwanted text with gsub
- split the line into 1) a filespec relative to lib/rock_books, and 2) the class definition
- strip unwanted space because that will mess up the horizontal alignment of the output
- use C-style printf formatting to align the text into two columns
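The Ruby part of those steps can also be written as an ordinary method, which may be easier to read and test than the shell-quoted version. This is a sketch: the method name format_struct_line is hypothetical, and the sample input is assembled from the grep prefix and definition line shown earlier.

```ruby
# Hypothetical helper: applies the same steps as the one-liner to one grep match.
def format_struct_line(line)
  filespec, definition = line
    .gsub('lib/rock_books/', '')   # make the filespec relative
    .gsub('< Struct.new', '')      # drop the superclass noise
    .gsub('; end', '')             # drop a trailing one-line class end, if any
    .split('.rb:')                 # separate filespec from class definition
    .map(&:strip)                  # stray spaces would break the alignment

  # Left-justify the filespec in a 40-character column, then the definition.
  '%-40s %-s' % ["#{filespec}.rb", definition]
end

grep_line = 'lib/rock_books/documents/journal_entry.rb:' \
            'class JournalEntry < Struct.new(:date, :acct_amounts, :doc_short_name, :description, :receipts)'

puts format_struct_line(grep_line)
```

A method like this could live in a load file, leaving only a short call such as format_struct_line(self) on the command line.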
Conclusion
Rexe is not revolutionary technology; it's just plumbing that removes parsing, formatting, and low-level configuration from your command line so that you can focus on the high-level task at hand.
When we consider a new piece of software, we usually think "what would this be helpful with now?". However, for me, the power of rexe is not so much what I can do with it in a single use case now, but rather what I will be able to do over time as I accumulate more experience and expertise with it.
I suggest starting to use rexe even for modest improvements in workflow, even if it doesn't seem compelling. There's a good chance that as you use it over time, new ideas will come to you and the workflow improvements will increase exponentially.
A word of caution though -- the complexity and difficulty of sharing your rexe scripts across systems will be proportional to the extent to which you use environment variables and loaded files for configuration and shared code. Be responsible and disciplined in making this configuration and code as clean and organized as possible.
Footnotes
[1]: Rexe is an embellishment of the minimal but excellent rb script at https://github.com/thisredone/rb. I started using rb and thought of lots of other features I would like to have, so I started working on rexe.
[2]: It's possible that when this page opens in your browser it will not play automatically. You may need to change your default browser, or change the code that opens the URL. Firefox's new (as of March 2019) version 66 suppresses autoplay; you can register exceptions to this policy: open Firefox Preferences, search for "autoplay" and add "https://www.youtube.com".
[3]: Making this truly OS-portable is a lot more complex than it looks on the surface. On Linux, xdg-open may not be installed by default. Also, Windows Subsystem for Linux (WSL) out of the box is not able to launch graphical applications.
Here is a start at a method that opens a resource portably across operating systems:
def open_resource(resource_identifier)
  command = case (`uname`.chomp)
  when 'Darwin'
    'open'
  when 'Linux'
    'xdg-open'
  else
    'start'
  end
  `#{command} #{resource_identifier}`
end
[4]: It is an interesting quirk of the Ruby language that ENV.to_s returns "ENV" and not the contents of the ENV object. As a result, many of the other output formats will also return some form of "ENV". You can handle this by specifying ENV.to_h.
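The quirk is easy to confirm in plain Ruby; this short sketch uses only core methods:

```ruby
puts ENV.to_s        # prints the literal string ENV
puts ENV.to_h.class  # prints Hash

# Converting first gives formatters a real Hash of names and values to work with.
env_hash = ENV.to_h
puts env_hash.key?('PATH')
```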