Andrew (he/him)
Posted on June 6, 2019
Finding Unique Shell Variables
You can see which shells are available on your machine with:
$ ls -lS /bin/*sh*
-r-xr-xr-x 1 root wheel 1278736 21 Mar 06:11 /bin/ksh*
-r-xr-xr-x 1 root wheel 618480 21 Mar 06:11 /bin/sh*
-r-xr-xr-x 1 root wheel 618416 21 Mar 06:11 /bin/bash*
-rwxr-xr-x 1 root wheel 610240 21 Mar 06:11 /bin/zsh*
-rwxr-xr-x 1 root wheel 375824 21 Mar 06:11 /bin/csh*
-rwxr-xr-x 1 root wheel 375824 21 Mar 06:11 /bin/tcsh*
This command works on all of the shells listed above.
...the -S flag orders them by size. See how csh and tcsh are exactly the same size? tcsh is just csh with more features -- csh is a proper subset of tcsh, so some operating systems simply redirect csh to tcsh, to avoid maintaining the two shells separately.
But how can you tell which shell is which? Running echo $0 on each of these shells yields the following results (on my machine, running macOS Mojave 10.14.4):
bash$ echo $0
-bash
csh$ echo $0
csh
ksh$ echo $0
ksh
sh$ echo $0
sh
tcsh$ echo $0
tcsh
zsh$ echo $0
zsh
bash is prefaced with a - because it's the shell I logged in under (I'm opening all the other shells from it). That makes this particular bash a login shell, and a login shell's $0 is prefaced with a -.
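If a script wants to compare $0 against plain shell names, the leading dash has to be stripped first. A minimal sketch, assuming an sh-family shell (this syntax does not work in csh or tcsh); the function name shell_name_from_arg0 is made up for illustration:

```shell
# Hypothetical helper: normalize a $0-style string to a bare shell name.
# sh-family syntax only (bash, ksh, zsh, sh); not valid in csh/tcsh.
shell_name_from_arg0() {
    name=${1#-}        # "-bash"    -> "bash"  (drop the login-shell dash)
    name=${name##*/}   # "/bin/zsh" -> "zsh"   (drop any directory prefix)
    printf '%s\n' "$name"
}

shell_name_from_arg0 "-bash"      # prints: bash
shell_name_from_arg0 "/bin/zsh"   # prints: zsh
```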
This is fine, except csh and tcsh are really the same tcsh shell. Is there any way we can discover this, through environment variables or otherwise? There is! This SO answer gives some guidelines for empirically determining which shell we're using:
- $ZSH_NAME is only set (by default) on zsh
- $version is only set (by default) on tcsh
...also...
- $BASH_VERSION is only set (by default) on bash
- $ZSH_VERSION is only set (by default) on zsh
- $KSH_VERSION is only set (by default) on ksh
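As a rough illustration of how these variables could be checked, here is a minimal sketch for sh-family shells only. The function name guess_shell is hypothetical and not taken from the SO answer, and csh/tcsh would need entirely different syntax to test $version:

```shell
# Sketch: guess the running shell from its version variable.
# sh-family syntax only; csh/tcsh cannot parse this.
guess_shell() {
    if [ -n "${BASH_VERSION:-}" ]; then
        echo bash
    elif [ -n "${ZSH_VERSION:-}" ]; then
        echo zsh
    elif [ -n "${KSH_VERSION:-}" ]; then
        echo ksh
    else
        echo unknown
    fi
}

guess_shell
```

Note that this reports the implementation rather than the name the shell was started under: a bash invoked as sh still sets BASH_VERSION.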
Finally, to discover any other shell-specific variables, we can use this gist. We can send the output of the command at the gist to shell-specific files (like ~/.bash.vars.out, ~/.csh.vars.out, etc.), find only the unique ones, and discover which file they're in with:
$ sort ~/.*.vars.out | uniq -u | sed -e 's/^/^/' -e 's/$/$/' > ~/.grep.tmp
$ grep -Hf ~/.grep.tmp ~/.*.vars.out
/Users/andrew/.bash.vars.out:BASH_REMATCH
/Users/andrew/.bash.vars.out:CLICOLOR
/Users/andrew/.bash.vars.out:LSCOLORS
...
This, again, is shell-agnostic. So long as the system has the sort, uniq, sed, and grep utilities (all part of Unix since at least 1979), the above will work the same on any shell. To test this assertion, I tried this command on fish, as well, and got a list of fish variables!
The above commands are a bit complex, so let me explain what is going on here, starting with the first part:
sort ~/.*.vars.out | uniq -u
This concatenates all of the ~/.*.vars.out files, sorts them alphabetically line-by-line, and keeps only the lines that appear exactly once (-u). So if a shell variable is repeated across multiple files, it is removed from the list entirely. Sample output:
$ sort ~/.*.vars.out
ARGC
Apple_PubSub_Socket_Render
Apple_PubSub_Socket_Render
Apple_PubSub_Socket_Render
Apple_PubSub_Socket_Render
BASH
BASH
BASH_ARGC
BASH_ARGC
BASH_ARGV
BASH_ARGV
BASH_LINENO
BASH_LINENO
...
$ sort ~/.*.vars.out | uniq -u
ARGC
BASH_REMATCH
CDPATH
CLICOLOR
CMD_DURATION
...
...note how all of the repeated variables are removed. Next, we send this output to sed to add a ^ to the beginning of each line and a $ to the end of each line:
$ sort ~/.*.vars.out | uniq -u | sed -e 's/^/^/' -e 's/$/$/'
^ARGC$
^BASH_REMATCH$
^CDPATH$
^CLICOLOR$
^CPUTYPE$
...
(This means, when we search for the unique ARGC across all of these ~/.*.vars.out files, we won't erroneously pick up BASH_ARGC, etc.) Next, we save this to a temporary file with the redirect >:
$ sort ~/.*.vars.out | uniq -u | sed -e 's/^/^/' -e 's/$/$/' > ~/.grep.tmp
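To see the steps so far on something small, here is a sketch that runs the same pipeline over two made-up files in a temporary directory. The file contents are invented for illustration, and the variable capture uses sh-family syntax:

```shell
# Toy data standing in for the real ~/.*.vars.out files (contents are made up).
demo=$(mktemp -d)
printf '%s\n' BASH BASH_ARGC HOME PATH > "$demo/.bash.vars.out"
printf '%s\n' ARGC HOME PATH           > "$demo/.zsh.vars.out"

# Same pipeline as above, pointed at the toy directory.
# HOME and PATH appear in both files, so uniq -u drops them.
patterns=$(sort "$demo"/.*.vars.out | uniq -u | sed -e 's/^/^/' -e 's/$/$/')

printf '%s\n' "$patterns"
# ^ARGC$
# ^BASH$
# ^BASH_ARGC$

rm -rf "$demo"
```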
Finally, we search for the lines of this temporary file within the ~/.*.vars.out files, using grep:
$ grep -Hf ~/.grep.tmp ~/.*.vars.out
/Users/andrew/.bash.vars.out:BASH_REMATCH
/Users/andrew/.bash.vars.out:CLICOLOR
/Users/andrew/.bash.vars.out:LSCOLORS
...
/Users/andrew/.fish.vars.out:CMD_DURATION
/Users/andrew/.fish.vars.out:FISH_VERSION
/Users/andrew/.fish.vars.out:GOSU_VERSION
...
/Users/andrew/.ksh.vars.out:ENV
/Users/andrew/.ksh.vars.out:FCEDIT
/Users/andrew/.ksh.vars.out:JOBMAX
...
/Users/andrew/.sh.vars.out:POSIXLY_CORRECT
/Users/andrew/.zsh.vars.out:ARGC
/Users/andrew/.zsh.vars.out:CDPATH
/Users/andrew/.zsh.vars.out:CPUTYPE
...
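The effect of the ^ and $ anchors can be checked on a toy example. This is only a sketch with invented file contents, and unlike the commands in this post it uses sh-family syntax for the variable captures:

```shell
# Toy files (contents made up): ARGC appears alone and inside BASH_ARGC.
demo=$(mktemp -d)
printf '%s\n' BASH_ARGC HOME > "$demo/.bash.vars.out"
printf '%s\n' ARGC HOME      > "$demo/.zsh.vars.out"

# Unanchored pattern: ARGC also matches inside BASH_ARGC.
printf '%s\n' 'ARGC' > "$demo/loose.tmp"
loose=$(grep -f "$demo/loose.tmp" "$demo"/.*.vars.out | wc -l | tr -d ' ')

# Anchored pattern, as produced by the sed step: only the exact name matches.
printf '%s\n' '^ARGC$' > "$demo/strict.tmp"
strict=$(grep -f "$demo/strict.tmp" "$demo"/.*.vars.out | wc -l | tr -d ' ')

echo "loose: $loose, strict: $strict"   # prints: loose: 2, strict: 1
rm -rf "$demo"
```

One possible alternative to the sed step would be grep -Fx -f with the plain names, since -F treats each pattern as a fixed string and -x only accepts whole-line matches.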
All of the above commands are shell-agnostic and will work the same across any of the aforementioned shells. There are lots of unique bash, ksh, and zsh variables, which can help us distinguish these shells from one another. However, sh has only a single unique shell variable, POSIXLY_CORRECT.
Note also that csh and tcsh are completely absent from this list. This is because, on my machine, csh simply redirects to tcsh (I do not really have a separate csh available), so every tcsh variable shows up in both files and uniq -u discards it. To account for this, let's remove the ~/.csh.vars.out file and repeat the analysis above. The new results look something like:
$ rm ~/.csh.vars.out
$ sort ~/.*.vars.out | uniq -u | sed -e 's/^/^/' -e 's/$/$/' > ~/.grep.tmp
$ grep -Hf ~/.grep.tmp ~/.*.vars.out
/Users/andrew/.bash.vars.out:BASH_REMATCH
/Users/andrew/.bash.vars.out:CLICOLOR
/Users/andrew/.bash.vars.out:LSCOLORS
...
/Users/andrew/.fish.vars.out:CMD_DURATION
/Users/andrew/.fish.vars.out:FISH_VERSION
/Users/andrew/.fish.vars.out:GOSU_VERSION
...
/Users/andrew/.ksh.vars.out:ENV
/Users/andrew/.ksh.vars.out:FCEDIT
/Users/andrew/.ksh.vars.out:JOBMAX
...
/Users/andrew/.sh.vars.out:POSIXLY_CORRECT
/Users/andrew/.tcsh.vars.out:addsuffix
/Users/andrew/.tcsh.vars.out:anyerror
/Users/andrew/.tcsh.vars.out:csubstnonl
...
/Users/andrew/.zsh.vars.out:ARGC
/Users/andrew/.zsh.vars.out:CDPATH
/Users/andrew/.zsh.vars.out:CPUTYPE
...
Great! Now we've got a list of variables which only appear on these shells, and we can use this list to determine (within some degree of certainty) which shell a user is working with. Just as importantly, the above code will work on any of the shells listed above, because only common utilities like grep, sed, and sort are used, and no shell-specific features are used anywhere.
Determining The Shell
To determine what shell a user is working with, based on their available environment / shell variables, we should make lists of which variables we expect to see (and which ones we do not expect to see) from the above list. We can construct these lists by splitting the above output on the name of the file in which each variable was found:
$ grep -f ~/.grep.tmp ~/.*.vars.out | awk '{split($0,a,":"); print "echo \""a[2]"\" >> ~/."substr(a[1],16,length(a[1])-23)"uniqvars;"}'
Again, this is complex, so let's break it down step-by-step:
grep -f ~/.grep.tmp ~/.*.vars.out
...this whole first bit is just what we had above, but this time, instead of printing it to the terminal, we're going to pipe it to awk:
... | awk '{split($0,a,":"); print "echo \""a[2]"\" >> ~/."substr(a[1],16,length(a[1])-23)"uniqvars;"}'
Here, we take each input line (a filename and a unique variable name) and split it on : characters. This just separates the name of the file in which the variable was found from the name of the variable. split creates an array in awk, where the element at index 1 is the first token, the element at index 2 is the second token, etc., and tokens are understood to be delimited by the given character (:). We also do a bit of substring extraction to remove the leading /Users/andrew/. and the trailing vars.out, so that /Users/andrew/.bash.vars.out becomes bash. (note that the offsets 16 and 23 are specific to the length of my home directory path).
Finally, we simply print the results to the terminal with the print statement, and some formatting, to get:
$ grep -f ~/.grep.tmp ~/.*.vars.out | awk '{split($0,a,":"); print "echo \""a[2]"\" >> ~/."substr(a[1],16,length(a[1])-23)"uniqvars;"}'
echo "BASH_REMATCH" >> ~/.bash.uniqvars;
echo "CLICOLOR" >> ~/.bash.uniqvars;
echo "LSCOLORS" >> ~/.bash.uniqvars;
...
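Because the substr offsets depend on the home directory path, here is a sketch of a variant that derives the shell name from the last path component instead. It is an alternative to the command above, not the original approach; the /home/someone paths are made-up sample input, and the variable capture uses sh-family syntax:

```shell
# Made-up grep-style input lines: "<path>:<variable>".
generated=$(printf '%s\n' \
    '/home/someone/.bash.vars.out:BASH_REMATCH' \
    '/home/someone/.zsh.vars.out:ARGC' |
awk -F: '{
    n = split($1, parts, "/")        # split the path on "/"
    base = parts[n]                  # last component, e.g. ".bash.vars.out"
    sub(/^\./, "", base)             # drop the leading dot
    sub(/\.vars\.out$/, "", base)    # drop the trailing ".vars.out"
    print "echo \"" $2 "\" >> ~/." base ".uniqvars;"
}')

printf '%s\n' "$generated"
# echo "BASH_REMATCH" >> ~/.bash.uniqvars;
# echo "ARGC" >> ~/.zsh.uniqvars;
```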
It should be pretty clear what we're going to do next: run this as a series of commands instead of printing them to the screen. This will append each of these variables to some new ~/.*.uniqvars files:
$ grep -f ~/.grep.tmp ~/.*.vars.out | awk '{split($0,a,":"); print "echo \""a[2]"\" >> ~/."substr(a[1],16,length(a[1])-23)"uniqvars;"}' > ~/.make.varfiles.tmp
$ chmod +x ~/.make.varfiles.tmp && eval ~/.make.varfiles.tmp
We'll now have a few uniqvars files in the ~ directory:
$ ls ~/.*.uniqvars
/Users/andrew/.bash.uniqvars /Users/andrew/.ksh.uniqvars /Users/andrew/.tcsh.uniqvars
/Users/andrew/.fish.uniqvars /Users/andrew/.sh.uniqvars /Users/andrew/.zsh.uniqvars
Great! Now, we can simply run our command from before, which gave us all of the environment and shell variables from this shell:
set | grep "^[a-zA-Z]" | sed -e 's/[[:space:]].*//' -e 's/=.*//'
...and compare it to each of the ~/.*.uniqvars files:
$ set | grep "^[a-zA-Z]" | sed -e 's/[[:space:]].*//' -e 's/=.*//' > ~/.shellvars.tmp
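To make the two sed expressions less mysterious, here is a sketch that feeds the same filter some made-up lines in both styles that set can produce: NAME=value (sh-family) and name followed by whitespace and a value (tcsh). The function name extract_var_names and the sample lines are invented for illustration, and the function syntax is sh-family only:

```shell
# The same filter as above, wrapped in a function that reads stdin.
extract_var_names() {
    grep "^[a-zA-Z]" | sed -e 's/[[:space:]].*//' -e 's/=.*//'
}

# Made-up sample: two NAME=value lines, one tcsh-style "name<TAB>value" line,
# and one indented continuation line that grep should drop.
names=$(printf 'HOME=/Users/andrew\nPS1=$ \nversion\ttcsh 6.x\n  continued line\n' |
    extract_var_names)

printf '%s\n' "$names"
# HOME
# PS1
# version
```

The first sed expression cuts everything from the first whitespace character onward, which handles the tcsh layout; the second cuts everything from the first = onward, which handles the sh-family layout. Names that begin with an underscore are skipped by the grep pattern.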
Now, compare to each file:
$ comm -12 ~/.shellvars.tmp ~/.bash.uniqvars | wc -l
0
$ comm -12 ~/.shellvars.tmp ~/.fish.uniqvars | wc -l
0
$ comm -12 ~/.shellvars.tmp ~/.ksh.uniqvars | wc -l
0
$ comm -12 ~/.shellvars.tmp ~/.sh.uniqvars | wc -l
0
$ comm -12 ~/.shellvars.tmp ~/.tcsh.uniqvars | wc -l
23
$ comm -12 ~/.shellvars.tmp ~/.zsh.uniqvars | wc -l
0
We have 0 lines in common with any of the *.uniqvars files, except the tcsh.uniqvars file. How many lines are in each file, though?
$ wc -l ~/.bash.uniqvars
21 /Users/andrew/.bash.uniqvars
$ wc -l ~/.fish.uniqvars
40 /Users/andrew/.fish.uniqvars
$ wc -l ~/.ksh.uniqvars
5 /Users/andrew/.ksh.uniqvars
$ wc -l ~/.sh.uniqvars
1 /Users/andrew/.sh.uniqvars
$ wc -l ~/.tcsh.uniqvars
23 /Users/andrew/.tcsh.uniqvars
$ wc -l ~/.zsh.uniqvars
89 /Users/andrew/.zsh.uniqvars
So this shell matches 23/23 of the expected unique shell and environment variables for tcsh, and none of those expected for any of the other shells. We can say with 100% confidence that this is a tcsh! But...
$ echo $0
csh
This is the shell which advertised itself as csh!
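The comparison above can be sketched as a small helper that reports matched/total for one uniqvars file. This is only an illustration in sh-family syntax (so it does not solve the cross-shell problem); the name score_shell and the toy data are made up. Since comm expects both inputs to be sorted the same way, the sketch sorts them explicitly first:

```shell
# Hypothetical helper: print "matched/total" for one uniqvars file.
score_shell() {
    vars_file=$1
    uniq_file=$2
    tmp=$(mktemp -d)
    # comm needs both inputs sorted with the same collation.
    LC_ALL=C sort -u "$vars_file" > "$tmp/vars"
    LC_ALL=C sort -u "$uniq_file" > "$tmp/uniq"
    matched=$(LC_ALL=C comm -12 "$tmp/vars" "$tmp/uniq" | wc -l | tr -d ' ')
    total=$(wc -l < "$tmp/uniq" | tr -d ' ')
    rm -rf "$tmp"
    echo "$matched/$total"
}

# Toy data (made up) standing in for ~/.shellvars.tmp and ~/.*.uniqvars.
demo=$(mktemp -d)
printf '%s\n' version addsuffix HOME PATH > "$demo/shellvars"
printf '%s\n' addsuffix anyerror version  > "$demo/tcsh.uniqvars"
printf '%s\n' BASH_REMATCH CLICOLOR       > "$demo/bash.uniqvars"

tcsh_score=$(score_shell "$demo/shellvars" "$demo/tcsh.uniqvars")
bash_score=$(score_shell "$demo/shellvars" "$demo/bash.uniqvars")
echo "tcsh: $tcsh_score, bash: $bash_score"   # prints: tcsh: 2/3, bash: 0/2
rm -rf "$demo"
```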
Next Steps
Ideally, our next task would be to package this up into a nice shell script, which could optionally give a confidence rating on the type of shell determined. But the reason we needed to use temporary files throughout this exercise is that shell variables are defined in totally different ways for the t/csh shells vs. other kinds of shells.
As it turns out, there are some big difficulties in implementing a cross-shell script or command which can determine the shell type in this way:
- ksh doesn't have a source shell command, but...
- . (dot) is defined in all shells and can be used to run a script, but...
- we cannot use shell variables because they are defined differently between t/csh shells vs. other shells, so...
- we can try to use awk's system() command, but it opens an sh shell only, so...
- we can try to write a C script and use stdlib::execl, but the sh shell is used by default (different shells can be specified, but that sort of defeats the purpose of this exercise)...
...so it looks like we're stuck!
Anyone have any ideas for getting this to run on any shell?