Leon Timmermans
Posted on December 15, 2020
The problem
Raku has a built-in argument parser. This is a really good idea, given how common argument parsing is, but I still ended up writing my own. To explain why, I should first explain what the built-in parser does.
The Raku built-in parser is a blind parser. This means that it converts the input arguments into a Capture without any knowledge of the MAIN sub, and then tries to call MAIN with that capture. This has several implications.
The first is that the input syntax has to be context-free. It allows for -foo or --foo (meaning :foo), -/foo or --/foo (meaning :!foo), and -foo=bar or --foo=bar (meaning :foo(val("bar"))). It does not allow traditional unix syntaxes such as -j2 or --jobs 2, as that would require knowing in advance that those two options take an argument.
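The "build a capture first, dispatch second" behaviour can be imitated by hand. The following is only a sketch of the idea, not the parser's actual implementation: the sub build stands in for MAIN, and its name and parameters are hypothetical.

```raku
# Stand-in for a MAIN sub; the name and parameters are hypothetical.
sub build(Str $target, Bool :$verbose = False, Int :$jobs = 1) {
    "target=$target verbose=$verbose jobs=$jobs"
}

# Roughly what a blind parser produces for:  --verbose --jobs=2 all
my $ok = \(val("all"), :verbose, :jobs(val("2")));
say build(|$ok);

# Roughly what it produces for:  --verbose --jobs 2 all
# "2" becomes a positional of its own, because the parser cannot know
# that --jobs is supposed to take a value.
my $bad = \(val("2"), val("all"), :verbose, :jobs);
say $bad ~~ &build.signature;   # does this capture fit the signature?
```

The second capture has one positional too many, so it cannot bind to the signature, even though a human reader knows exactly what was meant.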
The second issue is that it fails in very confusing ways. To explain this I will give some examples using zef (chosen because of its ubiquity).
$ zef instal
Usage:
zef [--force|--force-fetch] [--timeout|--fetch-timeout=<Int>] [--degree|--fetch-degree=<Int>] [--update=<Any>] fetch [<identities> ...] -- Download specific distributions
zef [--force|--force-test] [--timeout|--test-timeout=<Int>] test [<paths> ...] -- Run tests
zef [--force|--force-build] [--timeout|--build-timeout=<Int>] build [<paths> ...] -- Run Build.pm
zef [--fetch] [--build] [--test] [--depends] [--test-depends] [--build-depends] [--force] [--force-resolve] [--force-fetch] [--force-extract] [--force-build] [--force-test] [--force-install] [--timeout=<Int>] [--fetch-timeout=<Int>] [--extract-timeout=<Int>] [--build-timeout=<Int>] [--test-timeout=<Int>] [--install-timeout=<Int>] [--degree=<Int>] [--fetch-degree=<Int>] [--test-degree=<Int>] [--dry] [--upgrade] [--deps-only] [--serial] [--contained] [--update=<Any>] [--exclude=<Any>] [--to|--install-to=<Any>] install [<wants> ...] -- Install
zef [--from|--uninstall-from=<Any>] uninstall [<identities> ...] -- Uninstall
zef [--wrap=<Int>] [--update=<Any>] search [<terms> ...] -- Get a list of possible distribution candidates for the given terms
zef [--max=<Int>] [--update=<Any>] [-i|--installed] list [<at> ...] -- A list of available modules from enabled repositories
zef [--fetch] [--build] [--test] [--depends] [--test-depends] [--build-depends] [--force] [--force-resolve] [--force-fetch] [--force-extract] [--force-build] [--force-test] [--force-install] [--timeout=<Int>] [--fetch-timeout=<Int>] [--extract-timeout=<Int>] [--build-timeout=<Int>] [--test-timeout=<Int>] [--install-timeout=<Int>] [--degree=<Int>] [--fetch-degree=<Int>] [--test-degree=<Int>] [--dry] [--update] [--serial] [--exclude=<Any>] [--to|--install-to=<Any>] upgrade [<identities> ...] -- Upgrade installed distributions (BETA)
zef [--depends] [--test-depends] [--build-depends] depends <identity> -- View dependencies of a distribution
zef [--depends] [--test-depends] [--build-depends] rdepends <identity> -- View direct reverse dependencies of a distribution
zef [--sha1] locate <identity> -- Lookup locally installed distributions by short-name, name-path, or sha1 id
zef [--update=<Any>] [--wrap=<Int>] info <identity> -- Detailed distribution information
zef [--open] browse <identity> <url-type> -- Browse a distribution's available support urls (homepage, bugtracker, source)
zef look <identity> -- Download a single module and change into its directory
zef [--fetch] [--build] [--test] [--depends] [--test-depends] [--build-depends] [--force] [--force-resolve] [--force-fetch] [--force-extract] [--force-build] [--force-test] [--force-install] [--timeout=<Int>] [--fetch-timeout=<Int>] [--extract-timeout=<Int>] [--build-timeout=<Int>] [--test-timeout=<Int>] [--install-timeout=<Int>] [--degree=<Int>] [--fetch-degree=<Int>] [--test-degree=<Int>] [--update] [--upgrade] [--dry] [--serial] [--exclude=<Any>] [--to|--install-to=<Any>] smoke -- Smoke test
zef update [<names> ...] -- Update package indexes
zef [--confirm] nuke [<names> ...] -- Nuke module installations (site, home) and repositories from config (RootDir, StoreDir, TempDir)
zef [--version] -- Detailed version information
zef [-h|--help]
The problem here is a simple typo in the word "install", but nothing in this wall of output actually hints at that.
If it can match a subcommand the error message is better/shorter, but still not terribly helpful.
$ zef install
Usage:
zef [--fetch] [--build] [--test] [--depends] [--test-depends] [--build-depends] [--force] [--force-resolve] [--force-fetch] [--force-extract] [--force-build] [--force-test] [--force-install] [--timeout=<Int>] [--fetch-timeout=<Int>] [--extract-timeout=<Int>] [--build-timeout=<Int>] [--test-timeout=<Int>] [--install-timeout=<Int>] [--degree=<Int>] [--fetch-degree=<Int>] [--test-degree=<Int>] [--dry] [--upgrade] [--deps-only] [--serial] [--contained] [--update=<Any>] [--exclude=<Any>] [--to|--install-to=<Any>] install [<wants> ...] -- Install
Instead of telling us to give a module to install, it lists all the possible arguments for this subcommand (though to be fair, this one is largely zef's fault for making that first MAIN argument mandatory).
The same error message is given for zef install Foo --timeout=3.5 (because a Rat is not an Int) and zef install Foo --timeout 10 --timeout 10 (it passes a two value list instead of an Int to :$timeout).
These error messages are not helpful (and in some cases, treating the input as an error at all is not helpful either). The problem here is simple: Raku knows it can't dispatch the capture to any of the MAIN candidates, but it doesn't know why. Figuring out why requires exactly the sort of introspection that it tries so hard to avoid.
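The type mismatches behind these failures can be reproduced without zef. This is a small sketch using val, the routine mentioned earlier for converting argument values; the variable names are only illustrative.

```raku
# val() turns a raw string into the most specific allomorph it can.
for "10", "3.5", "bar" -> $raw {
    my $value = val($raw);
    say "$raw => { $value.^name }, Int? { $value ~~ Int }";
}

# A repeated option arrives as a list of values, which is not an Int either.
my @repeated = val("10"), val("10");
say @repeated ~~ Int;
```

Only "10" yields something that satisfies an Int constraint. The other values simply fail to bind, and at dispatch time there is nothing left that records which argument was at fault.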
But the most confusing way the argument parsing fails has to be the way it handles enums. It will interpret any argument that matches a known enum literal in scope as that enum value rather than as a string, e.g.:
$ zef install True
Cannot resolve caller new(Zef::Identity:U: Bool:D); none of these signatures match:
(Zef::Identity: Str :$name!, :ver(:$version), :$auth, :$api, :$from, *%_)
(Zef::Identity: Str $id, *%_)
A Raku program that uses the built-in argument parsing cannot receive any of the strings True, False, Less, More, Same, a bunch of others, or any enum literal you've defined in your script, as a plain string, because they'll be converted into something else entirely.
The solution
So instead of using the built-in argument parser, I wrote my own argument parsing module: Getopt::Long. Unlike the built-in one, it is contextual. It will first look at the sub MAIN to know what arguments to expect, then parse based on that, and then call MAIN. This way, one can parse -j2 to :j(2), and --jobs 2 to :jobs(2). This results in a far more unixish interface than what is possible using the default parsing.
But there's a second advantage: better error messages. It will try its hardest to give an informative error message. For example:
Unknown option --foo
Option --foo doesn't take arguments
Cannot convert --foo argument "10a" to number: trailing characters after number
Invalid Date '20-02-20' given as --foo argument; use yyyy-mm-dd instead
It tries very hard to detect any potential issue before dispatching is done, so that it can give an informative error message.
This should make it much easier for the user of a Raku program to figure out what they did wrong, while at the same time offering a much more standard interface to your users. It offers a series of options to make the external interface either more unixish (the default) or more like Raku's argument parsing (for backwards compatibility).
Getopt::Long offers both a functional interface (much like the Perl module that inspired it), and a MAIN wrapper that can function as a drop-in replacement for the existing argument parser: all you have to do to benefit from this is add a use Getopt::Long; to your script!
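As a minimal sketch of the MAIN wrapper, assuming the module is installed (for example with zef install Getopt::Long): the script name build.raku and its parameters are hypothetical, chosen only for illustration.

```raku
# build.raku (hypothetical script name and parameters)
use Getopt::Long;   # per the text above, this line is all that is needed

sub MAIN(Str $target, Bool :$verbose = False, Int :$jobs = 1) {
    say "target=$target verbose=$verbose jobs=$jobs";
}

# Invocations of the kind the text describes as supported:
#   raku build.raku --jobs 2 all
#   raku build.raku --jobs=2 --verbose all
```

Nothing else in the script changes. The signature of MAIN is what tells the parser that --jobs expects a value.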