How to use OSProcess with stdin and stdout

Martin Kuball
Hi!

I'm trying to do some OCR from squeak using tesseract. I installed OSProcess.
So far so good. But I don't know what classes to use for stdIn and stdOut. My
code would look something like this:

| proc stdIn stdOut d |
stdIn := ???
stdOut := ???
proc := ExternalUnixOSProcess forkAndExec: '/usr/bin/tesseract' arguments:
#('-' '-' '--dpi' '100') environment: nil descriptors: (Array with: stdIn
with: stdOut with: nil).
proc ifNil: [self class noAccessorAvailable].
d := Delay forMilliseconds: 50.
[proc runState == #complete] whileFalse: [d wait].
" and now read the text from stdOut..."

Can someone fill in the blanks or point me to code that does similar things?
Thanks very much.

Martin





Re: How to use OSProcess with stdin and stdout

David T. Lewis
On Thu, May 14, 2020 at 09:37:26PM +0200, Martin Kuball wrote:

> Hi!
>
> I'm trying to do some OCR from squeak using tesseract. I installed OSProcess.
> So far so good. But I don't know what classes to use for stdIn and stdOut. My
> code would look something like this:
>
> | proc stdIn stdOut d |
> stdIn := ???
> stdOut := ???
> proc := ExternalUnixOSProcess forkAndExec: '/usr/bin/tesseract' arguments:
> #('-' '-' '--dpi' '100') environment: nil descriptors: (Array with: stdIn
> with: stdOut with: nil).
> proc ifNil: [self class noAccessorAvailable].
> d := Delay forMilliseconds: 50.
> [proc runState == #complete] whileFalse: [d wait].
> " and now read the text from stdOut..."
>
> Can someone fill in the blanks or point me to code that does similar things?
> Thanks very much.

Hi Martin,

First, please also install CommandShell in addition to OSProcess. Get the
latest versions of both OSProcess and CommandShell, regardless of the version
of Squeak you are using. If you are using SqueakMap to load them, then please
select the versions labelled "(head)".

Start out by trying something like this:

    OSProcess outputOf: 'tesseract - - --dpi 100'

I'm not sure if this will do what you want but please give it a try,
and if it does not work I'll try to give a better answer.

This uses a couple of new methods that I added to OSProcess recently,
but have not mentioned until now. If it proves to be useful to you, you
will be the first :-)

Assuming that it works, here is what will have happened:

- The argument string is parsed into a unix-style command pipeline

- The pipeline is all objects, with OS process proxies doing the work

- When evaluated, any stderr output will show up in an error notifier
  in your image (proceed through the notifier)

- Command stdout is collected and answered as the result of #outputOf:

I would recommend running this in a debugger so you can step through
it and see what is going on.
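
As a minimal sketch of that experiment, assuming OSProcess and CommandShell
are loaded on a Unix-like host, the same call can first be tried with a
harmless pipeline. The echo and wc commands are illustrative stand-ins and
not part of the original suggestion:

```smalltalk
"Evaluate in a workspace with 'print it', or 'debug it' to step through."
| words |
"echo and wc are stand-ins for the real command"
words := OSProcess outputOf: 'echo one two three | wc -w'.
words withBlanksTrimmed
```

If stderr output appears, it should show up in the notifier described above
rather than in the answered string.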

Dave


Re: How to use OSProcess with stdin and stdout

stes



If one ignores the message when selecting "head", then that's a risk, no?

"The package you are about to install is not listed as being compatible
with your image version (Squeak5.3), so the package may not work properly.
Do you still want to proceed with the install Yes/No"

I see it when installing the "head" versions of these packages via the
"SqueakMap Catalog" (Package Loader):

  - OSProcess
  - CommandShell

Selecting "head" works but gives a warning that it is "potentially"
not compatible with the VM (or image?).

In my case, after installing (successfully, I think), I get:

 OSProcess versionString  prints: '4.6.19'
 CommandShell versionString prints: '4.7.10'

What I worry about is that the version I install with "head" may be
more recent than the one supported by the image or VM.

How can I check that they are compatible?

Thanks,
David Stes





--
Sent from: http://forum.world.st/Squeak-Dev-f45488.html


Re: How to use OSProcess with stdin and stdout

stes


I was asking myself the question on "install the 'head' version",
because when I read that you can "always" install the 'head' version,
I was thinking that for a class like OSProcess, it could matter ...

If I'm not mistaken the VM has some "primitives" built in, those
primitives should match the class that is installed, I guess.

Most likely you would encounter errors like "primitive" not existing
or similar, but nevertheless I wonder whether there is a way to see,
whether the primitives that are provided by the VM (or image?),
are "compatible" with the OSProcess and CommandShell that one installs.

For example, is there a way to see what version of the primitives
for "UnixOSProcessPlugin" is loaded in the VM?

It's not very appealing to just "try" and see experimentally whether it
works, as it will not be obvious whether any problem is due to a wrong
version of a class being installed, or due to some other issue.

David Stes



Re: How to use OSProcess with stdin and stdout

Ben Coman
In reply to this post by Martin Kuball


On Fri, 15 May 2020 at 03:37, Martin Kuball <[hidden email]> wrote:
> Hi!
>
> I'm trying to do some OCR from squeak using tesseract. I installed OSProcess.

Another option would be calling the library functions directly using FFI.  I notice...

cheers -ben



Re: How to use OSProcess with stdin and stdout

K K Subbu
In reply to this post by stes
On 15/05/20 8:17 pm, stes wrote:
> Most likely you would encounter errors like "primitive" not existing
> or similar, but nevertheless I wonder whether there is a way to see,
> whether the primitives that are provided by the VM (or image?),
> are "compatible" with the OSProcess and CommandShell that one installs.
>
> For example is there a way to see what version of the primitives
> for "UnixOSProcessPlugin" is loaded in the VM ?

You can get the names of the loaded modules with:

   Smalltalk listLoadedModules (old)
or
   SmalltalkImage current listLoadedModules

You can also see these in "About Squeak" -> VM Modules section.

AFAIK, the modules themselves, being in machine code, are not reified as
objects that can be inspected in Squeak. They are private to the VM.
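
The version strings mentioned earlier in the thread and the module list can
be printed together. This is only a sketch: it assumes both packages are
already loaded, and includesSubstring: is the selector in recent Squeak
images (older ones may spell it includesSubString:). As far as I know, a
plugin only appears in the list once it has actually been loaded.

```smalltalk
| modules |
"Print the installed package versions to the Transcript."
Transcript
	show: 'OSProcess ', OSProcess versionString; cr;
	show: 'CommandShell ', CommandShell versionString; cr.
"Keep only the loaded VM modules whose name mentions OSProcess."
modules := SmalltalkImage current listLoadedModules
	select: [:each | each includesSubstring: 'OSProcess'].
modules do: [:each | Transcript show: each; cr].
modules
```

This shows what is present, not whether the versions are compatible.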

HTH .. Subbu


Re: How to use OSProcess with stdin and stdout

David T. Lewis
In reply to this post by stes
Hi David,

I'm afraid you will just have to take my word for it :-)

SqueakMap is nice because it makes the packages easily findable,
but a long-standing annoyance is that I have no way of expressing
the version compatibility. The warnings from SqueakMap look alarming,
but I don't know how to fix that.

Bottom line: For any version of Squeak that has been released in
the last ten years or so, please use the latest version of OSProcess
and CommandShell.

Dave


On Fri, May 15, 2020 at 05:57:54AM -0500, stes wrote:

> If one ignores the message when selecting "head", then that's a risk, no?
>
> "The package you are about to install is not listed as being compatible
> with your image version (Squeak5.3), so the package may not work properly.
> Do you still want to proceed with the install Yes/No"
>
> I see it when installing the "head" versions of these packages via the
> "SqueakMap Catalog" (Package Loader):
>
>   - OSProcess
>   - CommandShell
>
> Selecting "head" works but gives a warning that it is "potentially"
> not compatible with the VM (or image?).
>
> In my case, after installing (successfully, I think), I get:
>
>  OSProcess versionString  prints: '4.6.19'
>  CommandShell versionString prints: '4.7.10'
>
> What I worry about is that the version I install with "head" may be
> more recent than the one supported by the image or VM.
>
> How can I check that they are compatible?
>
> Thanks,
> David Stes


Re: How to use OSProcess with stdin and stdout

Martin Kuball
In reply to this post by David T. Lewis
On Thursday, 14 May 2020, 23:27:59 CEST, David T. Lewis wrote:

> Start out by trying something like this:
>
>     OSProcess outputOf: 'tesseract - - --dpi 100'

Hi Dave,

thanks for your answer. Actually, I did install CommandShell because I thought
it might help me understand the usage of OSProcess. And maybe it will if I
give it more time.

So here is what I did: I configured the following repository:

MCHttpRepository
        location: 'http://www.squeaksource.com/OSProcess'
        user: ''
        password: ''

and installed OSProcess-Base dtl.71, OSProcess-AIO dtl.9 and OSProcess-Unix
dtl.35. But I do not see any mention of a head label. So where did I go wrong?

The command you suggested worked. At least if I provide the image to tesseract
as a file. I will go on with the debugger and try to find out how to feed the
image data to stdIn.

Martin





Re: How to use OSProcess with stdin and stdout

Martin Kuball
In reply to this post by Ben Coman
On Friday, 15 May 2020, 18:25:58 CEST, Ben Coman wrote:

> On Fri, 15 May 2020 at 03:37, Martin Kuball <[hidden email]> wrote:
> > Hi!
> >
> > I'm trying to do some OCR from squeak using tesseract. I installed
> > OSProcess.
>
> Another option would be calling the library functions directly using FFI.
> I notice...
> https://github.com/ottopedi/Squeak_Tesseract
>
> cheers -ben

The readme says it's for Windows 10. So it will not work on Linux out of the
box, or will it?

Martin





Re: How to use OSProcess with stdin and stdout

David T. Lewis
In reply to this post by Martin Kuball
Hi Martin,

On Fri, May 15, 2020 at 09:20:17PM +0200, Martin Kuball wrote:

> Hi Dave,
>
> thanks for your answer. Actually, I did install CommandShell because I thought
> it might help me understand the usage of OSProcess. And maybe it will if I
> give it more time.
>
> So here is what I did: I configured the following repository:
>
> MCHttpRepository
>         location: 'http://www.squeaksource.com/OSProcess'
>         user: ''
>         password: ''
>
> and installed OSProcess-Base dtl.71, OSProcess-AIO dtl.9 and OSProcess-Unix
> dtl.35. But I do not see any mention of a head label. So where did I go wrong?
>

From the OSProcess repository, load OSProcess-dtl.118. From the CommandShell
repository, load CommandShell-dtl.109. These are currently the most recent
versions. Ignore the sub-packages such as "OSProcess-Unix"; I created those
to support Pharo (something of a fool's errand, if I may say so). All of
the sub-packages are included in the full OSProcess and CommandShell packages.

A shortcut to do this is:

        Installer ss project: 'OSProcess'; install: 'OSProcess'.
        Installer ss project: 'CommandShell'; install: 'CommandShell'.

The "(head)" version labels in the SqueakMap package loader do the same
thing, except for the alarming warning messages which you can safely ignore.


> The command you suggested worked. At least if I provide the image to tesseract
> as a file. I will go on with the debugger and try to find out how to feed the
> image data to stdIn.

The basic Unix shell redirection operators #> and #< should work. For example,
try evaluating this:

   OSProcess outputOf: 'cat < /etc/services | edit'

I am not familiar with tesseract, but you can probably use the same approach.
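
Applied to the tesseract call from earlier in the thread, the redirection
might look like the sketch below. The file name /tmp/page.png is
hypothetical, and whether tesseract accepts the image on stdin this way has
not been confirmed here:

```smalltalk
| command text |
"/tmp/page.png is a hypothetical input image"
command := 'tesseract - - --dpi 100 < /tmp/page.png'.
"Answers whatever the command wrote to its stdout."
text := OSProcess outputOf: command.
text
```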

Dave



Re: How to use OSProcess with stdin and stdout

Chris Muller-3
In reply to this post by David T. Lewis
Hi Martin,

> I don't know what classes to use for stdIn and stdOut.

Squeak provides access to stdin and stdout from the FileStream class, even without OSProcess loaded.
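
A minimal sketch, assuming a Squeak version that provides FileStream stdout (recent releases do). Note that these streams belong to the Squeak VM process itself; they are not the stdin and stdout of an external program such as tesseract:

```smalltalk
"Write a line to the standard output of the Squeak VM process."
FileStream stdout
	nextPutAll: 'hello from Squeak';
	lf;
	flush
```

The text appears in the terminal that started the VM, if there is one.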

> But I do not see any mention of a head label. So where did I go wrong?

The head label is a naming convention used on some SqueakMap packages to build a developer's workstation to work on that package.  It usually means "load the latest code" including all tests and tools packages.  It's useful early on in a project, but when finally deploying your own app, you will definitely want to specify a fixed version, and _not_ the head version, otherwise your package could easily rot.

> SqueakMap is nice because it makes the packages easily findable,
> but a long-standing annoyance is that I have no way of expressing
> the version compatibility.

The above needs a clarification.  SqueakMap is widely misunderstood to be an SCM tool.  It's not, it's actually Squeak's App Store.  It's 100% about letting Publishers define working software Releases (including specifying the "version compatibility" -- which version of Squeak they were tested on) that can then be consumed with "one click" by Users, while providing a good UX for both.  I'm planning a revamp of it later this year.

> The warnings from SqueakMap look alarming,

This is the message Dave is referring to.

[Image: alarming-SM-message.png]

The head version is the developer's version.  When developers intending to work on a package see this message, they already know about potential compatibility issues.  The message is correct.  If it feels alarming, it means you should select the one listed under the "safely-available" filter.

Best,
  Chris



Re: How to use OSProcess with stdin and stdout

K K Subbu
On 18/05/20 5:22 am, Chris Muller wrote:
> The head version is the developer's version.  When developers intending
> to work on a package see this message, they already know about potential
> compatibility issues.  The message is correct.  If it feels alarming, it
> means you should select the one listed under the "safely-available" filter.

When you revamp the code later, could you also reword the message to
make this explicit? Perhaps something like,

"The package you are about to install has not yet been tested for
compatibility with your image version (Squeak 5.2). You may want to
select from the packages listed under "safely-available" filter.
Do you still want to proceed with the install?"

Regards .. Subbu


Re: How to use OSProcess with stdin and stdout

Chris Muller-3
Good idea, absolutely.  My plans for the revamp are initially for a new backend that provides a simple API that will eventually enable both a web AND new ToolBuilder interface.

 - Chris

On Mon, May 18, 2020 at 4:34 AM K K Subbu <[hidden email]> wrote:
> On 18/05/20 5:22 am, Chris Muller wrote:
> > The head version is the developer's version.  When developers intending
> > to work on a package see this message, they already know about potential
> > compatibility issues.  The message is correct.  If it feels alarming, it
> > means you should select the one listed under the "safely-available" filter.
>
> When you revamp the code later, could you also reword the message to
> make this explicit? Perhaps something like,
>
> "The package you are about to install has not yet been tested for
> compatibility with your image version (Squeak 5.2). You may want to
> select from the packages listed under "safely-available" filter.
> Do you still want to proceed with the install?"
>
> Regards .. Subbu




Re: How to use OSProcess with stdin and stdout

Martin Kuball
In reply to this post by David T. Lewis
Hi David,

I finally found a solution to my problem. Successfully converted 1300 images in
a couple of minutes. The code I came up with is roughly the following (using
the wc program here as a proof of concept before switching to tesseract and
image data for the input):

| input output stdErr proc accessProtect res err |
"One pipe per standard stream of the child process."
input := ExternalPipe nonBlockingPipe.
output := ExternalPipe blockingPipe.
stdErr := ExternalPipe nonBlockingPipe.
proc := ExternalOSProcess concreteClass
        programName: '/usr/bin/wc'
        arguments: #('-w')
        initialEnvironment: nil.
"The child reads from the input pipe and writes to the other two."
proc initialStdIn: input reader.
proc initialStdOut: output writer.
proc initialStdErr: stdErr writer.
accessProtect := Semaphore forMutualExclusion.
accessProtect critical: [
        "Start the external process."
        proc value.
        input nextPutAll: 'this is a test'.
        "Closing the writer end lets the child see end of file on its stdin."
        input closeWriter.
        res := output upToEndOfFile.
        err := stdErr upToEndOfFile
].
proc waitForTermination.
output close.
stdErr close.
"exitStatus is a number, so convert it before concatenating."
proc exitStatus ~= 0
        ifTrue: [^ 'error ' , proc exitStatus printString , ': ' , err]
        ifFalse: [^ res]


Maybe it is possible to reuse more of your high-level code. But I wanted low
overhead and complete control over the input and output streams.

By the way, uninstalling an older version of OSProcess and installing the new
one left an OSProcess watcher with an ObsoleteUnixOSProcessAccessor running.

And a final question: If you want to get a deeper understanding of process
handling in Squeak, what would you recommend reading? Is the blue book still a
good reference?

Thanks again for your help and the great code.

Martin



On Saturday, 16 May 2020, 00:13:19 CEST, David T. Lewis wrote:

> The basic Unix shell redirection operators #> and #< should work. For
> example, try evaluating this:
>
>    OSProcess outputOf: 'cat < /etc/services | edit'






Re: How to use OSProcess with stdin and stdout

K K Subbu
On 23/05/20 2:05 am, Martin Kuball wrote:
> Hi David,
>
> I finally found a solution to my problem. Successfully converted 1300 images in
> a couple of minutes. The code I come up with is roughly the following (using
> the wc program here as a proof of concept before switching to tesseract and
> image data for the input):

Very nice! You could also look at waitForCommand: which spawns an
external process and waits for its completion.
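
A minimal sketch of that file-based style, assuming a Unix-like host. The
output path is hypothetical and wc again stands in for the real program:

```smalltalk
| proc stream result |
"Run the command through a shell and wait until it has finished."
proc := OSProcess waitForCommand: 'wc -w < /etc/services > /tmp/wordcount.txt'.
"contentsOfEntireFile answers the whole file and closes the stream."
stream := FileStream readOnlyFileNamed: '/tmp/wordcount.txt'.
result := stream contentsOfEntireFile.
proc exitStatus = 0
	ifTrue: [result withBlanksTrimmed]
	ifFalse: ['command failed with status ', proc exitStatus printString]
```

Input and output travel through files here, so the image never holds a pipe
to the external program.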

> Maybe it is possible to reuse more of your high-level code. But I wanted low
> overhead and complete control over the input and output streams.

Squeak is not just an application running on the host. It is a whole
virtual machine. You have to think of the host as another node in a
network. Using stdin/stdout makes the Squeak process a co-routine of the
host process, and you must be prepared to handle SIGPIPE etc. You may not
want such close coupling.

Back in 2008-09, students (11-13yrs) wanted to use Etoys/Squeak to
practice Math and languages with complex scripts (Hindi, Kannada) that
only LaTeX supported. Simon Guest had written a LatexMorph for
processing simple LaTeX code on the host. I improved it to handle complex
Indic scripts. Students would type LaTeX sentences in a text morph.
LatexMorph would save this string in a file, run latex and dvipng to
produce an image and read it back as an ImageMorph. It was a hack using
just OSProcess waitForCommand: and tmpfs (to avoid disk I/O), but it was fast
enough for live LaTeX renders, and even teachers took to it. We sure had
a lot of fun with it.

See https://squeaksource.com/LatexMorph (~ 500 lines of Squeak. The
operative LatexUnix class is only around 50 lines). I can send you the
changeset if you are interested.

HTH .. Subbu


Re: How to use OSProcess with stdin and stdout

David T. Lewis
In reply to this post by Martin Kuball
Hi Martin,


On Fri, May 22, 2020 at 10:35:08PM +0200, Martin Kuball wrote:

> Hi David,
>
> I finally found a solution to my problem. Successfully converted 1300 images in
> a couple of minutes. The code I come up with is roughly the following (using
> the wc program here as a proof of concept before switching to tesseract and
> image data for the input):

I'm glad it worked out for you!

>
> Maybe it is possible to reuse more of your high-level code. But I wanted low
> overhead and complete control over the input and output streams.


Indeed, it can be interesting to work with this at a lower level so you
can see exactly what is going on.


>
> By the way, uninstalling an older version of OSProcess and installing the new
> one left an OSProcess watcher with an ObsoleteUnixOSProcessAccessor running.

This does not surprise me. That would just be the child process watcher that
was left running after removing all the OSProcess classes, so I guess you would
have to terminate the old process in that case.

>
> And a final question: If you want to get a deeper understanding of process
> handling in Squeak, what would you recommend reading? Is the blue book still a
> good reference?

I think we are talking about two different things here. In Smalltalk, a
process is very lightweight, more like what you would call a thread in most
operating systems today. The blue book descriptions are still very relevant.
There have been some changes and improvements in Squeak over the years, but
the basics would still be the same.

The "processes" in OSProcess are a different thing entirely. These refer to
the processes of the underlying operating system. For an operating system
like Unix (Linux) or Windows, these processes are heavier-weight, and they
carry quite a lot of execution context in addition to the basic schedulable
unit of execution.

If you were to make a comparison, a Process in Squeak is like a "green thread"
in typical operating system lingo.
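
To make the distinction concrete, here is a small sketch of a Smalltalk
Process (a green thread inside the image). It uses only standard Squeak
classes and involves no operating system process at all:

```smalltalk
| done result |
done := Semaphore new.
"fork creates a lightweight Smalltalk Process scheduled inside the image."
[result := (1 to: 1000) inject: 0 into: [:sum :each | sum + each].
 done signal] fork.
"The workspace process waits until the forked one signals completion."
done wait.
result
```

An operating system process, by contrast, is what OSProcess outputOf: or
OSProcess waitForCommand: start on the host.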

But to your question - yes I would start with the blue book, and then others
on this list (notably Eliot Miranda) can give explanations of the finer points
and the more recent changes.

Dave




Re: How to use OSProcess with stdin and stdout

Sean P. DeNigris
Administrator
In reply to this post by Martin Kuball
Sorry, I just saw this thread. For the future, I did a small wrapper like
this that might help [1]. It looks like I didn't convert the repo to Tonel
yet and IIRC there are few dependencies other than OSP, so it may load in
Squeak (or could be used for inspiration). I would also be happy to accept
PRs to make it so. I probably will convert to Tonel at some point - although
it looks like Tonel support in Squeak may be imminent, so hopefully no
problem there :)


Martin Kuball wrote
> Maybe it is possible to reuse more of your high-level code.

This is what I ended up with [2]:
        | p result |
        p := PipeableOSProcess waitForCommand: commandString.
        p succeeded ifFalse: [ ^ self error: 'tesseract failed with: ', p errorUpToEnd ].
        result := self tempFile readStreamDo: [ :str | str contents ].
        self tempFile delete.
        ^ result.

If I had to do it again, I'd probably try via FFI. I've had a longstanding
belief that wrapping command line stuff is an easy way to "get it to work"
and then I can later "get it right" with FFI, but after doing a lot of this
sort of thing, there are so many quirks that I think it might usually be
easier to just start with FFI (although maybe the grass is always
greener...).

[1]. https://github.com/seandenigris/Tesseract-St
[2].
https://github.com/seandenigris/Tesseract-St/blob/master/src/Tesseract.package/Tesseract.class/instance/evaluate..st



-----
Cheers,
Sean
--
Sent from: http://forum.world.st/Squeak-Dev-f45488.html


Re: How to use OSProcess with stdin and stdout

Martin Kuball
In reply to this post by David T. Lewis
Hi David,

On Tuesday, 26 May 2020, 01:28:43 CEST, David T. Lewis wrote:

> Hi Martin,
>
> On Fri, May 22, 2020 at 10:35:08PM +0200, Martin Kuball wrote:
> > Hi David,
> >
> > I finally found a solution to my problem. Successfully converted 1300
> > images in a couple of minutes. The code I came up with is roughly the
> > following (using the wc program here as a proof of concept before
> > switching to tesseract and image data for the input):
> > | input output stdErr proc accessProtect res err |
> > input := ExternalPipe nonBlockingPipe.
> > output := ExternalPipe blockingPipe.
> > stdErr := ExternalPipe nonBlockingPipe.
> > proc := ExternalOSProcess concreteClass
> >     programName: '/usr/bin/wc'
> >     arguments: #('-w')
> >     initialEnvironment: nil.
> > proc initialStdIn: input reader.
> > proc initialStdOut: output writer.
> > proc initialStdErr: stdErr writer.
> > accessProtect := Semaphore forMutualExclusion.
> > accessProtect critical: [
> >     proc value.
> >     input nextPutAll: 'this is a test'.
> >     "Close the write end so that wc sees end of input."
> >     input closeWriter.
> >     res := output upToEndOfFile.
> >     err := stdErr upToEndOfFile
> > ].
> > proc waitForTermination.
> > output close.
> > stdErr close.
> > proc exitStatus ~= 0
> >     ifTrue: [^ 'error ' , proc exitStatus printString , ': ' , err]
> >     ifFalse: [^ res]
>
> I'm glad it worked out for you!
>
> > Maybe it is possible to reuse more of your high-level code. But I wanted
> > low overhead and complete control over the input and output streams.
>
> Indeed, it can be interesting to work with this at a lower level so you
> can see exactly what is going on.
>
> > By the way, uninstalling an older version of OSProcess and installing the
> > new one left an OSProcess watcher with an ObsoleteUnixOSProcessAccessor
> > running.
> This does not surprise me. That would just be the child process watcher that
> was left running after removing all the OSProcess classes, so I guess you
> would have to terminate the old process in that case.

That's exactly what I did ;).

>
> > And a final question: If you want to get a deeper understanding of process
> > handling in Squeak, what would you recommend reading? Is the blue book
> > still a good reference?
>
> I think we are talking about two different things here. In Smalltalk, a
> process is very lightweight, more like what you would call a thread in most
> operating systems today. The blue book descriptions are still very relevant.
> There have been some changes and improvements in Squeak over the years, but
> the basics would still be the same.

Sorry, my question really was a little ambiguous. It was about the
Smalltalk "internal" process model, such as class Semaphore. I still don't
understand why you use the critical: method in so many places.
Another example: for debugging purposes I added a Delay after starting the
external process, but that did not work. The process exited immediately with
an error saying that it could not read from stdIn.
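
For reference, here is a general sketch of what critical: provides, written as plain workspace code with made-up names rather than anything taken from OSProcess. Semaphore forMutualExclusion answers a semaphore that one Process at a time can hold, and critical: evaluates its block only while holding it. Two forked Processes share a log here, and each yields the processor in the middle of its block, yet the other Process cannot get in until the block is finished.

```smalltalk
| mutex finished log |
mutex := Semaphore forMutualExclusion.
finished := Semaphore new.
log := OrderedCollection new.
#('a' 'b') do: [:name |
    [3 timesRepeat: [
        mutex critical: [
            log add: name , ' enters'.
            "Give the other Process a chance to run. It blocks on the
             mutex instead of entering the protected block."
            Processor yield.
            log add: name , ' leaves']].
     finished signal] fork].
"Wait until both forked Processes are done."
finished wait; wait.
log
```

In the resulting log every 'enters' entry is directly followed by the matching 'leaves' entry. Without the critical: wrapper, the yield would let the two Processes interleave their entries.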







Re: How to use OSProcess with stdin and stdout

Martin Kuball
In reply to this post by Sean P. DeNigris
Hi Sean,

your solution with temp files is definitely simpler and more maintainable. But
out of sheer ambition I wanted to do it without them. At the moment, though, I
do not have enough ambition to do it with FFI. Getting the data structures
right is not that easy. I remember having a hard time writing a connector for
the xvid library more than 10 years ago. But I had a lot more spare time back
then.

Martin








Re: How to use OSProcess with stdin and stdout

Martin Kuball
In reply to this post by Sean P. DeNigris
On Tuesday, 26 May 2020, 15:44:53 CEST, Sean P. DeNigris wrote:
> https://github.com/seandenigris/Tesseract-St

By the way, how do you open this in Monticello? Do I have to clone it manually
first? Thanks.