=encoding utf8 =head1 TITLE DRAFT: Synopsis 19: Command Line Interface =head1 VERSION Created: 12 Dec 2008 Last Modified: 06 Oct 2014 Version: 28 This is a draft document. This document describes the command line interface. It has changed extensively from previous versions of Perl in order to increase clarity, consistency, and extensibility. Many of the syntax revisions are extensions, so you'll find that much of the Perl 5 syntax embedded in your muscle memory will still work. Notable features described in the sections below include: =over 4 =item * A smart default command-line processor in the core =item * All options have a long, descriptive name for increased clarity =item * Common options have a short, single-character name, and allow clustering =item * Extended option syntax provides the ability to set boolean true/false =item * New C<++> metasyntax allows options to be passed through to subsystems =back This interface to Perl 6 is special in that it occurs at the intersection of the program and the operating system's command line shell, and thus is not accessed via a consistent syntax everywhere. A few assumptions are made here, which will hopefully stand the test of time: All command-line arguments are assumed to be in Unicode unless proven otherwise; and Perl is born of Unix, and as such the syntax presented in this document is expected to work in a Unix-style shell. To explore the particularities of other operating systems, see L (TBD). =head1 Command Line Elements The command line is broken down into two basic elements: a I, and I. Each command line element is whitespace separated, so elements containing whitespace must be quoted. The I processes the arguments and performs the requested actions. For rakudo compiled with both MoarVM and JVM backends, respectively C and C. With one compiled backend, the command can be shortened to C It is followed by zero or more I. The I portion of the command line is made available at run-time via the C<< PROCESS::<$PROGRAM> >> variable (normally accessed as C<$*PROGRAM>) as an C. Command line I are broken down into I and I. Each option may take zero or more values. After all options have been processed, the remaining values (if any) generally consist of the name of a script for Perl to execute, followed by arguments for that script. If no values remain, Perl 6 implicitly opens STDIN to read the script. If you wish to pass arguments to a script read from STDIN, you must specify STDIN by name (C<-> on most operating systems). =head1 Backward (In)compatibility You may find yourself typing your favorite Perl 5 options, even after Christmas has arrived. As you'll see below, common options are provided which behave similarly. Less common options, however, may not be available or may have changed syntax. If you provide Perl with unrecognized command-line syntax, Perl gives you a friendly error message. If the unrecognized syntax is a valid Perl 5 option, Perl provides helpful suggestions to allow you to perform the same action using the current syntax. =head2 Unchanged Syntactic Features Several features have not changed from Perl 5, including: =over 4 =item * The most common options have a single-character short name =item * Single-character options may be clustered with the same syntax and semantics =item * Many command-line options behave similarly, for example: Option... Still means... -a Autosplit -c Check syntax -e *line* Execute -F *expression* Specify autosplit field separator -h Display help and exit -I *directory*[,*directory*[,...]] Unshift CompUnitRepo::Local('s) to @?INC -n Act like awk -p Act like sed -S Search PATH for script -T Enable taint mode -v Display version info -V Display verbose config info All of these options have extended syntax, and some may have slightly different semantics, so see L below for the details. =back =head2 Removed Syntactic Features Some Perl 5 command-line features are no longer available, either because there's a new and different way to do it in Perl 6, or because they're no longer relevant. Here's a breakdown of what's been removed: =over 4 =item -0 *octal/hex* Sets input record separator. Missing due to lack of specification in L. There is a comment about this in the L section at the end of this document. =item -C *number/list* Control Unicode features. Perl 6 has Unicode semantics and assumes a UTF-8 command-line interface (until proven otherwise, at which point this functionality may be readdressed). =item -d, -dt, -d:foo, -D, etc. Debugging commands. Replaced with the C<++BUG> metasyntactic option. =item -E *line* Execute a line of code, with all features enabled. This is specific to Perl 5.10, and not relevant to Perl 6, where C<-e> performs this function. =item -i *extension* Modify files in-place. Haven't thought about it enough to add yet, but I'm certain it has a strong following. {{TODO review decision here}} =item -l Enable automatic line-ending processing. This is the default behavior. =item -M *module*, -m *module*, etc. use/no module. Replaced by C<--use>. =item -P Run option through C preprocessor. This caused problems for Perl 5, and is completely obsolete now. =item -s Enable rudimentary switch parsing. By default, Perl 6 parses the arguments passed to a script using the signature supplied by the user in the MAIN routine (see L). =item -t Enable taint warnings mode. Taint mode needs more thought, but it's much more likely that the C<-T> switch will take options rather than use a second command-line flag to implement related behavior. =item -u Dump the core after parsing the program. This has been deemed obsolete. =item -U Allow unsafe operations. This is extremely dangerous and infrequently used, and doesn't deserve its own command-line option. =item -w Enable warnings. This is the default behavior. =item -W Enable all warnings. This is infrequently used, and doesn't deserve its own command-line option. =item -X Disable all warnings. This is infrequently used, and doesn't deserve its own command-line option. =back =head1 Options and Values Command line options are parsed using the following rules: =over 4 =item * Options must begin with one of the following symbols: C<-->, C<->, or C<:>. =item * Options are case sensitive. C<-o> and C<-O> are not the same option. =item * All options have a multi-character, descriptive name for increased clarity. Multi-character option names always begin with C<--> or C<:>. =item * Common options have a short, one-character name for speed. Single-character names always begin with C<->. =item * Single-character options may be clustered. C<-ab> means C<-a -b>. When a single-character option which requires a value is clustered, the option may appear only in the final position of the cluster. =item * Options may be negated with C, for example C<--/name>, C<:/name>, C<-/n>. Negated single-character options cannot appear in a cluster. In practice, negated options are rare anyway, as most boolean options default to False. =item * Option names follow Perl 6 identifier naming convention, except C<'> is not allowed, and single-character options may be any character or number. =item * The special option C<--> and C<< -e >> signal the parser to stop option processing. Arguments following a bare C<--> (with no identifier) or a C<-e> are always parsed as a list of values, even if they look like valid options. =back Delimited options allow you to transparently pass one or more options through to a subsystem, as specified by the special options that delimit those options. They are parsed according to the following rules: =over 4 =item * The opening and closing delimiters begin with two or more plus characters, for example C<++>. You'll usually use two plus characters, but more are allowed to avoid ambiguity when nesting delimited options. =item * Opening and closing delimited option names follow option identifier naming convention, defined above. =item * If the closing delimiter is omitted, the rest of the command line is consumed. =item * Inside a delimited option, the C<--> option does not suppress searching for the closing delimiter. That is, only the rest of the arguments within the delimiters are treated as values. =item * Eager matching semantics are used, so the first closing delimiter found completes the match. =item * Delimited options cannot be negated. However, the final delimiter takes a slash indicating the termination of the delimited processing, much like a closing HTML tag. =back These options are made available in dynamic variables matching their name, and are invisible to C except as C<< %*OPTSZ<> >>. For example: ++PARSER --setting=Perl6-autoloop-no-print ++/PARSER is available inside your script as C<< %*OPTSZ<> >>, and contains C<--setting=Perl6-autoloop-no-print>. Since eager matching is used, if you need to pass something like: ++foo -bar ++foo baz ++/foo ++/foo you'll end up with %*OPTS = '-bar ++foo baz'; which is probably not what you wanted. Instead, add extra C<+> characters +++foo -bar ++foo baz ++/foo +++/foo which will give you %*OPTS = '-bar ++foo baz ++/foo'; allowing you to properly nest delimited options. The actual storage location of C<%*OPTS> may be either in C<< PROCESS::<%OPTS> >> or C<< GLOBAL::<%OPTS> >>, depending on how the process sets up its interpreters. Values are parsed with the following rules: =over 4 =item * Values are passed to options with the following syntax C<--option=value> or C<--option value>. =item * Values containing whitespace must be enclosed in quotes, for example C<-O="spacey value"> =item * Multiple values are passed by specifying multiple instances of the option, as in C<--option=val1 --option='val 2'>. =back =head2 Remaining arguments Any remaining arguments to the Perl 6 program are placed in the C<@*ARGS> array. =head1 Option Reference Perl 6 options, descriptions, and services. =head2 Synopsis multi sub perl6( Bool :a($autoloop-comb), Bool :c($check-syntax), Bool :$doc, :e($execute), :F($autoloop-delim), Bool :h($help), :I(@include), :L($language), Bool :n($autoloop-no-print), :O($output-format), Bool :p($autoloop-print), Bool :S($search-path), Bool :T($taint), :u($use), Bool :v($version), Bool :V($verbose-config), Bool :x($extract-from-text), ); =head2 Reference =over 4 =item --autoloop-comb, -a When used with C<-n> or C<-p>, implicitly combs input and assigns the result to C<@_> within the loop produced by the C<-n> or C<-p>. The default pattern is C, an alternate pattern for comb may be specified with C<--autoloop-pattern>, a.k.a. C<-F>. =item ++CMD --command-line-parser *parser* ++/CMD Add a command-line processor. When this option is parsed, it immediately triggers an action that affects or replaces the command-line parser. Therefore, it is a good idea to put this option as early as possible in the argument list. =item --check-syntax, -c Check syntax, then exit. Desugars to C<-e 'CHECK { compiles_ok(); exit; }'>. =item --doc Lookup Perl documentation in Pod format. Desugars to C<-e 'CHECK { compiles_ok(); dump_perldoc(); }'>. C<@*ARGS> contains the arguments passed to C, and is available at C time, so C can respond to command-line options. {{TODO may create a ++DOC subsystem here. also, may use -d for short name, even though it clashes with perl 5}} =item ++BUG [*switches*, *flags*] ++/BUG Set switches and flags for the debugger. Note: The debugger needs further specification. =item --execute, -e *line* Execute a single-line program in lax mode. If you don't wish to run in lax mode, but with strictures and warnings enabled, pass 'use strict;' at the -e on the command line, like C<-e 'use strict; my $x = 42'>. =item --autoloop-delim, -F *expression* Pattern to split on (used with -a). Substitutes an expression for the default split function, which is C<{split ' '}>. Accepts Unicode strings (as long as your shell lets you pass them). Allows passing a closure (e.g. -F "{use Text::CSV}"). Awk's not better any more :) =item --help, -h Print summary of options. Desugars to C<++CMD --print-help --exit ++/CMD>. =item --include, -I location[,Class=location[,...]] Prepend C's to @?INC, for ad hoc module searching. The L class will be assumed if no class has been specified. Any class specified should adhere to the L interface. In any Searching the standard library follows the policies laid out in L. =item --language, -L *dsl* Set the domain specific language for parsing the script file. (That is, specify the I (often known as the prelude) for the program.) C<++PARSER --setting=*dsl* ++/PARSER>. =item --autoloop-no-print, -n Act like awk. Desugars to C<++PARSER --setting=Perl6-autoloop-no-print ++/PARSER>. =item --output-format, -O *format* Emit compiler output to STDOUT in the specified format, rather than invoking the compiled code immediately. This option is implementation-specific, so consult the documentation for your Perl 6 implementation for further details. =item --autoloop-print, -p Act like sed. Desugars to C<++PARSER --setting=Perl6-autoloop-print ++/PARSER>. =item --search-path, -S Use PATH environment variable to search for script specified on command-line. =item --taint, -T Turns on "taint" checking. See L for details. Commits very early. Put this option as early on the command-line as possible. =item --use, -u *module* C<--use *module*> and C<-u *module*> desugars to C<-e 'use *module*'>. Specify version info and import symbols by appending info to the module name: -u'Sense:auth:ver<1.2.1> ' You'll need the quotes so your shell doesn't complain about redirection. There is no special command-line syntax for C<'no *module*>, use C<-e>. =item --version, -v Display program name, version, patchlevel, etc. Desugars to C<++CMD -v ++/CMD ++PARSER -v ++/PARSER ++BUG -v ++/BUG>. =item --verbose-config, -V Display configuration details. Desugars to C<++CMD -V ++/CMD ++PARSER -V ++/PARSER ++BUG -V ++/BUG>. =item --extract-from-text, -x Run program embedded in Unicode text. Scan for the first line starting with C<#!> and containing the word C, and start there instead. This is useful for running a program embedded in a larger message. (In this case you would indicate the end of the program using the C<=END> block, as defined in L.) Desugars to C<--PARSER --Perl6-extract-from-text --/PARSER>. =back =head1 Metasyntactic Options Metasyntactic options are a subset of delimited options used to pass arguments to an underlying component of Perl. Perl itself does not parse these options, but makes them available to run-time components via the C<%*META-ARGS> dynamic variable. Standard in Perl 6 are three underlying components, C, C, and C. Implementations may expose other components via this interface, so consult the documentation for your Perl 6 implementation. On command line... Subsystem gets... ++X a -b ++/X a -b # Nested options +++X a -b ++X -c ++/X -d e +++/X a -b ++X -c ++/X -d e # More than once (both are valid, but the second form is preferred) ++X a -b ++/X -c ++X -d e ++/X a -b -d e +++X a -b +++/X -c ++X -d e ++/X a -b -d e =head1 Environment Variables Environment variables may be used to the same effect as command-line arguments. =over 4 =item PATH Used in executing subprocesses, and for finding the program if the -S switch is used. =item PERL6LIB A list of directories in which to look for ad hoc Perl library files. Note: this is speculative, as library loading is not yet specified, except insofar as S11 mandates various behaviors incompatible with mere directory probing. =item PERL6OPT Default command-line arguments. Arguments found here are prepended to the list of arguments provided on the command-line. =back =head1 References =over 4 =item L =item L =item L =item L =item L =item L =item L =back =head1 Notes I'd like to be able to adjust the input record separator from command line, for instance to specify the equivalent of perl 5's C<$/ = \32768;>. So far, I don't have a solution, but perhaps pass a closure that evaluates to an Int? This should try to use whatever option does the same thing to a new filehandle when S16 is further developed. =begin consideration [probably a setter method on $*IN of some sort? --law] Sandboxing? maybe -r Env var? maybe -E. Could be posed in terms of substituting a different setting. =end consideration =head1 AUTHORS Jerry Gay Moritz Lenz =for vim:set expandtab sw=4: