Background¶
Command-line invocation interfaces are surprisingly complicated, and Invoke’s needs are a little on the unique side. This document outlines the various possible approaches to this problem, and why Invoke ended up with the mechanism it currently provides.
Note
This is purely design documentation; you don’t need to understand it in order to use the software!
Basics¶
Flags/options, and arguments given to them, are pretty universal:
$ invoke --boolean
$ invoke --takes-arg value
$ invoke --takes-arg=value
$ invoke -s
$ invoke -s svalue
Using “standard” GNU style long and short flags is a no-brainer. A less common syntax is “Java style” where verbose/long flags can be given with a single dash:
$ java -jar path/to/jar
We’re not a fan of this style so it was never under consideration. (That said, Invoke’s parser mechanism is generic enough to support it with only minor tweaks.)
Short flag peculiarities¶
Implementation of short flags tends to vary, re: whether an equals sign is
allowed (program -f=bar
), and sometimes even whether a space is required
(tail -n5
).
There’s also combined short flags (tar -xzv
standing in for tar -x -z
-v
) which only work for boolean options as they’d be ambiguous otherwise (are
the 2nd through Nth characters other combined short flags, or are they a value
intended for the 1st short flag?)
Invoke supports both of these features: equals signs may be used with short flags just as with long flags (but are not required), and Boolean short flags may be combined.
The multiple tasks vs positional args problem¶
This is the meat. Invoke really wanted to have multiple tasks in the spirit of Fabric 1.x, which supported invocations like:
$ fab task1:arg1,arg2=val2 task2:args
though we didn’t want to continue using that specific invoke style. Instead we wanted something that looked like this:
$ invoke --global-opts task1 --task1-opts task2 --task2-opts [...]
Unfortunately argparse
and friends are unable to do this, because they:
- have no way of “partitioning” input, so all flags/options get lumped together; and
- consider any non-flag input to be “positional arguments” and toss them into a simple list with no way to tell where they showed up in the original invocation.
That leaves us on our own.
The naive approach¶
The main hurdle here is determining when one task’s “chunk” of the input ends and another begins. In a really simple use case, we could just say “anything that’s not a flag is a task name”. Our earlier example, repeated below, is exactly that simple:
$ invoke --global-opts task1 --task1-opts task2 --task2-opts [...]
Using the “not a flag” heuristic above gives us nicely delineated chunks to
perhaps hand off to some argparse
based sub-parsers.
Unfortunately, this breaks down as soon as you get into non-boolean options
(i.e. options that take values/arguments.) E.g. what if task1
can be
parameterized by some kwarg
that takes a value:
$ invoke --global-opts task1 --kwarg value task2
In this case, our naive parser will think value
is a task name, when it’s
actually supposed to be the value given to kwarg
. Not good!
Telling tasks apart¶
There are multiple ways to solve this problem and tell tasks apart from flag arguments, each with pluses/minuses:
Give the parser deep knowledge of the flags involved so that when it encounters
--flag value task
it knows that--flag
is supposed to have a value after it. This can be done explicitly (thinkargparse
style argument definitions) or can be inferred from the task definition.- Unfortunately, we can’t use
argparse
oroptparse
to do this as they are unable to cope with various edge cases that arise in a multi-task situation. - For example, while
optparse
can be told to stop at the first unknown positional argument (in our case, the next task name) and could thus be used in a loop, it still thinks about the flags it encounters globally. Two tasks using the same flag name would make usingoptparse
impossible. Requiring globally unique flag names would solve this, but is too user-hostile.
- Unfortunately, we can’t use
Force users to avoid putting spaces between flags and their arguments, i.e. always doing
--flag=value
or-fvalue
. This makes task name tokens unambiguous, at the cost of cutting out a very commonly used invocation style (--flag value
). Even worse, such an implementation falls into the trap of looking very similar to, but not behaving like, all other programs using “normal” flag syntax.Add special syntax to denote task names, e.g.:
$ invoke --global-opts task1: --kwarg value task2: --task2-opts
where the trailing colon is a sign that that particular “word” is a task name. This is unorthodox but not terrifically ugly and could ease parsing.
Alternately, use standalone sentinel characters in-between the task/arg “chunks” which have no special shell meaning, e.g.:
$ invoke --global-opts , task1 --kwarg value , task2 --task2-opts , [...]
While this would be the simplest to parse, it’s also pretty unappealing.
Per-task positional args¶
An additional benefit of the second two approaches above is that they would enable support for arbitrary per-task positional arguments, e.g.:
$ invoke --global-opts task1: arg1 arg2 --kwarg=value task2: --kwarg2=value2
The first approach – using knowledge about the tasks involved to determine where boundaries start and begin – requires more complexity in the parser engine to handle positional arguments. Even then, it is not possible to have truly arbitrary positional args (meaning an undetermined until runtime number) due to the ambiguity between “the next positional arg for this task” and “the next task name”.
Invoke’s approach¶
At this point in time, Invoke has gone with the first option: relying on rich task & argument objects being available to the parser, which uses them to determine whether a given token is a flag, a value/argument to a flag, a positional argument, or a task name.
This preserves “traditional” style invocations (no special sigils denoting task boundaries) with the only tradeoff being that arbitrary numbers of positional arguments are not possible.
Space-delimited flag values that look like flags themselves¶
This is an interesting oddity that arises in all types of task/argument parsing. Consider this invocation:
$ invoke --takes-a-value --some-other-valid-flag
It can be interpreted in two ways:
--takes-a-value
having its value set to"--some-other-flag"
- Pluses: allows specifying flag-like values, which would otherwise have to be escaped in some fashion.
- Minuses: can obscure user error.
--some-other-valid-flag
being interpreted as an actual flag, and an error being generated because--takes-a-value
is then missing a value.- Has the inverse tradeoff to the above: fast-fails on user error, but would require escaping for actual flag-like values to be treated as flag arguments.
A related issue is the possibility of invalid flag-like values, e.g.:
$ invoke --takes-a-value --not-even-a-valid-flag
This doesn’t even make sense in the 2nd approach above, because now we’ve both got a “missing value” error and a “unknown flag” error, whereas the 1st approach still works as the user probably intended.