Improving command line argument processing


David T. Lewis
Background:

Motivated by the recent "Image not startable after save" discussion, I was
looking to see if there might be a way to inject a startup patch script,
specified as a command line option, that would be evaluated prior to any
of the processStartupList: processing.

That led me to notice that '--' handling in the command line is currently
broken in trunk (see the -help message from the VM for intended behavior).

So I fixed that, but then I noticed that argument processing was generally
all bolloxed up and worked inconsistently.

So I fixed that, thinking that I might put it in the inbox. But then I
noticed that the VM is also inconsistent with respect to passing the '--' token
to the image in the VM parameters.

So I fixed that, thinking that I could do a pull request to opensmalltalk-vm
to get that resolved.

Then I noticed that the readDocumentAtStartup preference serves no useful
purpose once the argument processing is fixed, so I eliminated use of the
preference, so it can be deprecated and removed at a later date.

But in total, this is a lot of change to something that never worked
right in the first place, so I would like to summarize my idea of how
I think it *should* work for review before posting any code. If this
seems reasonable, I'll put my changes in the inbox.

Here is how I think it should work, showing first a unix command line, and
then the expected arguments as seen in the image:

  $ squeak squeak.image -- arg1 arg2 arg3 "do not treat arg1 as a script"
  Smalltalk arguments ==> #('arg1' 'arg2' 'arg3')
  (1 to: 4) collect: [:i | Smalltalk argumentAt: i ] ==> #('arg1' 'arg2' 'arg3' nil)

  $ squeak squeak.image script.st -- arg1 arg2 arg3 "script.st runs"
  Smalltalk arguments ==> #('arg1' 'arg2' 'arg3')
  (1 to: 4) collect: [:i | Smalltalk argumentAt: i ] ==> #('arg1' 'arg2' 'arg3' nil)

  $ squeak squeak.image -- script.st arg1 arg2 arg3 "script.st does not run"
  Smalltalk arguments ==> #('script.st' 'arg1' 'arg2' 'arg3')
  (1 to: 5) collect: [:i | Smalltalk argumentAt: i ] ==> #('script.st' 'arg1' 'arg2' 'arg3' nil)

  $ squeak squeak.image script.st arg1 arg2 arg3 "script.st runs"
  Smalltalk arguments ==> #('arg1' 'arg2' 'arg3')
  (1 to: 4) collect: [:i | Smalltalk argumentAt: i ] ==> #('arg1' 'arg2' 'arg3' nil)

  $ squeak script.st arg1 arg2 arg3 ==> error image not found

  $ squeak script.st -- arg1 arg2 arg3 ==> error image not found

  "Preferred behaviour, but VM patch required"
  $ squeak -- arg1 arg2 arg3 "use default image name squeak.image and no start script"
  Smalltalk arguments ==> #('arg1' 'arg2' 'arg3')
  (1 to: 4) collect: [:i | Smalltalk argumentAt: i ] ==> #('arg1' 'arg2' 'arg3' nil)
 
  "Without VM patch, this is the expected (but not desirable) behavior, because
  the VM fails to provide '--' in vmParameters in this case only. Note that the
  current workaround is simply to specify the image name on the command line."
  $ squeak -- arg1 arg2 arg3 ==> Error: no content to install
  Smalltalk arguments ==> #('arg2' 'arg3')
  (1 to: 4) collect: [:i | Smalltalk argumentAt: i ] ==> #('arg2' 'arg3' nil nil)
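
The cases above boil down to one rule for the tokens that follow the image
name. A minimal sketch of that rule in shell, assuming the image name has
already been consumed; classify_image_tokens is a hypothetical helper written
for illustration and is not code from the image or the VM:

```shell
#!/bin/sh
# Hypothetical helper (not part of Squeak or the VM): models the proposed rule
# for the tokens that follow the image name on the command line.
# Prints "script=<name or none>", then one "arg=<value>" line per argument.
classify_image_tokens() {
    script=none
    if [ "$#" -gt 0 ]; then
        if [ "$1" = "--" ]; then
            shift                   # '--' comes first: no start script
        else
            script=$1               # first token names the start script
            shift
            if [ "$#" -gt 0 ] && [ "$1" = "--" ]; then
                shift               # optional '--' after the script is dropped
            fi
        fi
    fi
    printf 'script=%s\n' "$script"
    for a in "$@"; do
        printf 'arg=%s\n' "$a"
    done
}

classify_image_tokens script.st -- arg1 arg2 arg3
```

Here script.st is reported as the start script and arg1 arg2 arg3 as the
arguments. With '--' moved ahead of script.st, the script is reported as none
and script.st becomes the first argument.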

Dave


Re: Improving command line argument processing

marcel.taeumel
Hi Dave!

Thanks for looking into this! :-)

> So I fixed that, thinking that I might put it in the inbox. But then I
> noticed that the VM is also inconsistent with respect to passing the '--' token
> to the image in the VM parameters.

Well, it would be nice to also work with "/arg" not just "--arg" on the Windows CMD shell. ;-)

Best,
Marcel



Re: Improving command line argument processing

David T. Lewis
Hi Marcel,

> Well, it would be nice to also work with "/arg" not just "--arg" on the Windows CMD shell. ;-)

I think that is a slightly different topic. The '--' that I was referring
to is a token (with whitespace around the '--' string) that can appear
in the middle of a command line. I think this usage is based on some
old unix conventions, and it basically tells the VM executable to stop
processing arguments and pass everything else remaining on the command
line to the image itself.

This convention allows you to explicitly separate the VM parameters (the
ones before the '--' token) from the remaining parameters that should be
passed along to the image. It also allows you to write a command line that
tells the VM "do not treat the next argument as a start script, just pass
it along to the image". And that in turn is why I think that the
readDocumentAtStartup preference is no longer required once we clean
up the argument processing issues.

I just took a look at the Windows and Unix VMs, and they both do a very
similar job of parsing VM and image parameters. But the Windows VM does
*not* have the bug that I mentioned, so if we want to fix that bug we
would only need to address it in the unix VM.

With respect to arguments identified by '-a value' or '--arg value' or
'/arg value', I think that might be an issue for the VM itself. If you
wanted the Windows VM to accept VM arguments like '/arg', that would be
something to implement in the Windows VM. And if you wanted to do something
similar in the image for handling the parameters passed to the image, you
could do that also (at a later time I guess).

For parsing argument lists in the image, I did some googling and was
reminded that Ian Piumarta apparently implemented a Smalltalk version
of the ancient unix standard getopt(). That would be a reasonable thing
for us to include in the image, but I don't know what became of the
actual code.

Dave


On Mon, Jun 08, 2020 at 11:49:24AM +0200, Marcel Taeumel wrote:

> Hi Dave!
>
> Thanks for looking into this! :-)
>
> > So I fixed that, thinking that I might put it in the inbox. But then I
> > noticed that the VM is also inconsistent with respect to passing the '--' token
> > to the image in the VM parameters.
>
> Well, it would be nice to also work with "/arg" not just "--arg" on the Windows CMD shell. ;-)
>
> Best,
> Marcel


Re: Improving command line argument processing

K K Subbu
On 08/06/20 4:23 am, David T. Lewis wrote:
>    $ squeak squeak.image -- arg1 arg2 arg3 "do not treat arg1 as a script"

This should result in an error. Unix convention is to prefix - or -- to
options and therefore -- is a special marker to terminate option
processing and allow arguments beginning with -. So you can type "grep
-- -v" to search for -v in files. In our case, squeak.image does not
begin with - and so option processing stops automatically. Anything
after the image file becomes the arguments for the image and not the
platform vm. So there is no need to use -- for disambiguation.
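
The grep case is easy to try. A small sketch, assuming a grep that follows
the usual '--' convention; the input lines are made up for illustration:

```shell
#!/bin/sh
# Search standard input for the literal text "-v".
# Without '--', grep would read -v as its own "invert match" option
# and would then be left without a pattern.
printf '%s\n' 'use -v for verbose' 'nothing here' | grep -- -v
```

Only the first input line contains -v, so only that line is printed.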

>    $ squeak script.st arg1 arg2 arg3 ==> error image not found

On Unix, the env variable SQUEAK_IMAGE is checked if the first argument
is not an image file. If the variable is not set, then the image file is
assumed to be "squeak.image". So this will be a valid invocation:

      $ squeak ${SQUEAK_IMAGE:-squeak.image} script.st arg1 arg2 arg3
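
The ${VAR:-default} form is standard shell parameter expansion: it yields the
default when the variable is unset or empty. A small sketch; default_image is
a hypothetical helper used only to show the expansion, and the path is made
up:

```shell
#!/bin/sh
# Hypothetical helper: prints the image name that the expansion would select.
default_image() {
    printf '%s\n' "${SQUEAK_IMAGE:-squeak.image}"
}

unset SQUEAK_IMAGE
default_image                       # falls back to squeak.image
SQUEAK_IMAGE=/tmp/my.image
default_image                       # uses the value of SQUEAK_IMAGE
```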

Regards .. Subbu


Re: Improving command line argument processing

David T. Lewis
Hi Subbu,

On Tue, Jun 09, 2020 at 12:56:03AM +0530, K K Subbu wrote:

> On 08/06/20 4:23 am, David T. Lewis wrote:
> >   $ squeak squeak.image -- arg1 arg2 arg3 "do not treat arg1 as a script"
>
> This should result in an error. Unix convention is to prefix - or -- to
> options and therefore -- is a special marker to terminate option
> processing and allow arguments beginning with -. So you can type "grep
> -- -v" to search for -v in files. In our case, squeak.image does not
> begin with - and so option processing stops automatically. Anything
> after the image file becomes the arguments for the image and not the
> platform vm. So there is no need to use -- for disambiguation.
>
> >   $ squeak script.st arg1 arg2 arg3 ==> error image not found
>
> On Unix, the env variable SQUEAK_IMAGE is checked if the first argument
> is not an image file. If the variable is not set, then the image file is
> assumed to be "squeak.image". So this will be a valid invocation:
>
>      $ squeak ${SQUEAK_IMAGE:-squeak.image} script.st arg1 arg2 arg3
>

Exactly so. The '--' terminates further processing of the argument list,
and passes the remainder to the image for further interpretation.

According to the -help message from the VM:

$ squeak -help
Usage: /usr/local/lib/squeak/4.19.1-3780/squeakvm [<option>...] [<imageName> [<argument>...]]
       /usr/local/lib/squeak/4.19.1-3780/squeakvm [<option>...] -- [<argument>...]

And a note toward the end of the help message says:

  Precede <arguments> by `--' to use default image.

The above does not tell you explicitly how to parse this:

  $ squeak squeak.image -- arg1 arg2 arg3
 
But if you consider that the VM is parsing the command line in the normal
manner up to the '--' token, and then passing the rest on to the image
as arguments, then the behavior seems quite reasonable.

The need for '--' for disambiguation remains, partly because of the Squeak
convention of allowing an optional start script to be specified after the
image name.  You need to have some way of determining if 'arg1' was intended
as an argument for the image to process, or if it was intended as the name
of a script document to be evaluated at image startup time.

Dave
 


Re: Improving command line argument processing

David T. Lewis
I put the changes in the inbox as System-dtl.1164 for review.

Dave



Re: Improving command line argument processing

K K Subbu
On 09/06/20 4:56 am, David T. Lewis wrote:
> According to the -help message from the VM:
>
> $ squeak -help
> Usage: /usr/local/lib/squeak/4.19.1-3780/squeakvm [<option>...] [<imageName> [<argument>...]]
>         /usr/local/lib/squeak/4.19.1-3780/squeakvm [<option>...] -- [<argument>...]
>
> And a note toward the end of the help message says:
>
>    Precede <arguments> by `--' to use default image.

Ha! I now see the cause for confusion. Here arguments refers to image's
arguments not vm's options.

>
> The above does not tell you explicitly how to parse this:
>
>    $ squeak squeak.image -- arg1 arg2 arg3
>  
> But if you consider that the VM is parsing the command line in the normal
> manner up to the '--' token, and then passing the rest on to the image
> as arguments, then the behavior seems quite reasonable.

The vm option parsing stops at the first word which does not begin with -,
or at --, *whichever occurs earlier*. In your example, vm options
will terminate at squeak.image and not see the -- coming after it. The
-- will be parsed as an image arg word and passed as such to the image.

As best as I could make out, the usage syntax is:

    <vm> <vmopt>* [--] [<file.image>] <image-args>*
    <vmopt> = -<letter> | --<word>

The [--] is needed to mark the end of <vmopt> only when <file.image> or the
first <image-arg> begins with - (unlikely in interactive use but possible in
shell scripts). The vm does check whether the first non-option word ends in
.image. If not, it uses ${SQUEAK_IMAGE:-squeak.image} and passes the
remaining args to this image.
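
The syntax above can be modelled in a few lines of shell. This is only a
sketch of the grammar as written, not the VM's parser: parse_vm_command_line
is a hypothetical name, -headless stands in for any option word, and options
that take a separate value word are not handled.

```shell
#!/bin/sh
# Hypothetical model of the usage syntax above; this is not the VM's code.
# Assumes every VM option is a single word beginning with '-'.
parse_vm_command_line() {
    # Skip VM options up to '--' or the first word that does not begin with '-'.
    while [ "$#" -gt 0 ]; do
        case $1 in
            --) shift; break ;;     # explicit end of VM options
            -*) shift ;;            # a VM option word
            *)  break ;;            # first non-option word
        esac
    done
    # The next word names the image only if it ends in .image.
    case ${1-} in
        *.image) image=$1; shift ;;
        *)       image=${SQUEAK_IMAGE:-squeak.image} ;;
    esac
    printf 'image=%s\n' "$image"
    for a in "$@"; do
        printf 'arg=%s\n' "$a"
    done
}

parse_vm_command_line -headless squeak.image -- arg1 arg2
```

In this call the option scan stops at squeak.image, so the later -- is never
seen as an option terminator and is printed as the first image argument.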

> The need for '--' for disambiguation remains, partly because of the Squeak
> convention of allowing an optional start script to be specified after the
> image name.  You need to have some way of determining if 'arg1' was intended
> as an argument for the image to process, or if it was intended as the name
> of a script document to be evaluated at image startup time.

Image arg parsing is handled within the image and does not have to use
the Unix convention. I wouldn't recommend it because Squeak Image runs
on multiple platforms. The convention would be to check if the first arg
is a registered command word (e.g. eval st) or a chunk file (*.st) and
then let the command/script parse the rest of the arguments.

Regards .. Subbu


Re: Improving command line argument processing

David T. Lewis
Hi Subbu,

One follow-up note, slightly off topic: the vmrun
shell script that I have previously shared on the list has a bug
that affects the command line arguments. On the last line of the
script, the argument list needs to be enclosed in quotes, otherwise
it will not let you pass a command line parameter that contains
a space character.

I am attaching a fixed version. The only change is on the last
line of the shell script:

  exec ${VM} "$@"
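
The difference is easy to see by counting arguments. A minimal sketch, with
hypothetical helper functions standing in for the last line of the script:

```shell
#!/bin/sh
# Hypothetical helpers, not part of vmrun: each forwards its arguments to
# count_args, once unquoted and once quoted.
count_args()       { printf '%s\n' "$#"; }
forward_unquoted() { count_args $@; }      # arguments are word-split again
forward_quoted()   { count_args "$@"; }    # each argument is kept intact

forward_unquoted squeak.image "my script.st"   # prints 3
forward_quoted   squeak.image "my script.st"   # prints 2
```

With the unquoted form, the argument "my script.st" arrives as two separate
words; with the quoted form it arrives as one.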

Dave


On Tue, Jun 09, 2020 at 11:21:21AM +0530, K K Subbu wrote:

> On 09/06/20 4:56 am, David T. Lewis wrote:
> >According to the -help message from the VM:
> >
> >$ squeak -help
> >Usage: /usr/local/lib/squeak/4.19.1-3780/squeakvm [<option>...]
> >[<imageName> [<argument>...]]
> >        /usr/local/lib/squeak/4.19.1-3780/squeakvm [<option>...] --
> >        [<argument>...]
> >
> >And a note toward the end of the help message says:
> >
> >   Precede <arguments> by `--' to use default image.
>
> Ha! I now see the cause for confusion. Here arguments refers to image's
> arguments not vm's options.
>
> >
> >The above does not tell you explicitly how to parse this:
> >
> >   $ squeak squeak.image -- arg1 arg2 arg3
> >  
> >But if you consider that the VM is parsing the command line in the normal
> >manner up to the '--' token, and then passing the rest on to the image
> >as arguments, then the behavior seems quite reasonable.
>
> The vm option parsing stops at the first word which does not begin with -,
> or at --, *whichever occurs earlier*. In your example, vm options
> will terminate at squeak.image and not see the -- coming after it. The
> -- will be parsed as an image arg word and passed as such to the image.
>
> As best as I could make out, the usage syntax is:
>
>    <vm> <vmopt>* [--] [<file.image>] <image-args>*
>    <vmopt> = -<letter> | --<word>
>
> The [--] is needed to mark the end of <vmopt> only when <file.image> or the
> first <image-arg> begins with - (unlikely in interactive use but possible in
> shell scripts). The vm does check whether the first non-option word ends in
> .image. If not, it uses ${SQUEAK_IMAGE:-squeak.image} and passes the
> remaining args to this image.
>
> >The need for '--' for disambiguation remains, partly because of the Squeak
> >convention of allowing an optional start script to be specified after the
> >image name.  You need to have some way of determining if 'arg1' was
> >intended
> >as an argument for the image to process, or if it was intended as the name
> >of a script document to be evaluated at image startup time.
>
> Image arg parsing is handled within the image and does not have to use
> the Unix convention. I wouldn't recommend it because Squeak Image runs
> on multiple platforms. The convention would be to check if the first arg
> is a registered command word (e.g. eval st) or a chunk file (*.st) and
> then let the command/script parse the rest of the arguments.
>
> Regards .. Subbu
>