I have been working with Moose 5.0 on a project involving natural language processing. I used TextLint to parse the input, simply because it contains a set of parsers that give output in a convenient form. Having read all about the virtues of Pharo 4, I decided to give it a try and set out to reproduce the earlier results. I have run into a number of problems and have not been able to get anywhere. I realise TextLint is rather antique, but as I use it, it is just a set of PetitParser parsers, so I can't see why there should be a problem. I have boiled the problems down to a few test cases, which are not realistic bits of my work but show the essential points.

1. Environment: Windows 7 Professional (64-bit version).
2. Download and install the latest Pharo 4 (Latest update: #40612). This is just to get an up-to-date VM.
3. Download the latest Moose 5.1 image and unzip it into the same folder as Pharo 4. On opening, the Moose 5.1 image also shows Latest update: #40612. (I have to ignore Firefox's hysterical warnings that the INRIA site does not have a valid security certificate.)
4. Load TextLint into the image. I can't get ConfigurationOfTextLint to work, so I load TextLint-Model-JorgeRessia.225.mcz and TextLint-Tests-lr.166.mcz via the Monticello Browser. These are the same versions I used in my earlier Moose 5.0 image, where they worked as expected.
5. Open a playground, enter: PPToken on: 'test'. and select 'Inspect it'. The inspector halts with the message 'MNU ByteString>>findAnySubstring:startingAt:'. Basic Inspect works as expected. Does this look like the problem raised by Nicolas Lusa today?
6. Enter in the playground: word := (TLTextTokenizer parse: 'test' startingAt: #word) at: 1. (The tokenizer gives an array of tokens, so we need to select the first, and only, one.) Basic Inspect confirms that word is a PPToken, as expected.
7. Enter in the playground: TLWord with: word. Select 'Do it'. The image becomes completely unresponsive and appears to be in a tight loop. The only way out is to click the Windows red X and say yes to exit without saving. After restarting, I get to the same point and select 'Debug it'. I trace it to TLSyntacticElement>>initializeWith: aToken, where the only line of code is token := aToken. Again a complete lockup.

Sorry for the lengthy details; I just hope there is enough there to enable someone to diagnose the problem(s). Any help gratefully received. Meanwhile I am back to the Pharo 3 version.

Thanks

Peter Kenny

From: Pharo-users [mailto:[hidden email]] On behalf of PBKResearch

Replies to the numbered points of the original message:

Point 4 (loading ConfigurationOfTextLint): ConfigurationOfTextLint does not load because it cannot find the project NEC, which is not at 'http://ss3.gemstone.com/ss/NEC'; this has to be changed in ConfigurationOfGlamour>>default:. Your configuration should not load Glamour, because it is already included in the Moose image.

Point 5 (inspecting PPToken on: 'test'): Seems resolved on the latest Moose 5.1 (#874) on Pharo #40613.

Point 6 (word := (TLTextTokenizer parse: 'test' startingAt: #word) at: 1): The code doesn't work. Did you mean: (TLTextTokenizer parse: 'test' startingAt: #word) parsedValue at: 1 ?

Point 7 (TLWord with: word locks up the image): Press Alt+. as soon as you execute the method; you should be able to find the error. By the way, it seems that in TLSyntacticElement>>text, value has to be replaced by inputValue (value is deprecated, and the notify event seems to loop…). It seems to work after that.

Clear details, though. I think you can go back to Pharo 4.

You're welcome,

Vincent Blondeau

This e-mail and the documents attached are confidential and intended solely for the addressee; it may also be privileged. If you receive this e-mail in error, please notify the sender immediately and destroy it. As its integrity cannot be secured on the Internet, the Worldline liability cannot be triggered for the message content. Although the sender endeavours to maintain a computer virus-free network, the sender does not warrant that this transmission is virus-free and will not be liable for any damages resulting from any virus transmitted.
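
For anyone following along, the point 7 fix can be sketched as below, in the usual Class >> selector notation. The method body is an assumption: the thread only says that value has to be replaced by inputValue in TLSyntacticElement>>text, so check the actual TextLint-Model source before editing.

```smalltalk
"ASSUMED shape of the method before the change: sending the deprecated
 #value to the PPToken held in 'token' triggers the looping notification."
TLSyntacticElement >> text
	^ token value

"After the change: #inputValue is the replacement named in the reply above."
TLSyntacticElement >> text
	^ token inputValue
```

If the image locks up before you can make the change, the user interrupt mentioned above (Alt+.) should bring up a debugger from which the offending send can be located.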

Vincent

Many thanks. You have solved the problem and educated me in the process, since I now know how to do a user interrupt; I was pressing Ctrl-Break with no effect. The problem was the looping of the deprecation notifier, which I can fix by removing the deprecated 'value' message. The looping should not have occurred, and there must be a bug somewhere, but it's not my problem.

One little detail, about your comment on point 6. On my system, (TLTextTokenizer parse: 'test' startingAt: #word) generates an array of PPToken, so it does not understand parsedValue. Are we looking at different versions of TextLint?

Thanks again

Peter Kenny

From: Pharo-users [mailto:[hidden email]] On behalf of PBKResearch

Peter,

On the looping of the deprecation notifier: actually, the notify message opens a debugger, and the "value" message is sent by the debugger, but value sends a notify message that opens a new debugger, and so on. I don't think that is a bug. It is not the notify: method that should be used but the deprecated: one.

On point 6: TLTextTokenizer parse: 'test' startingAt: #word returns a PPToken, not an array of PPToken, in the version I have. I have this version of TextLint:

Name: TextLint-Model-JorgeRessia.225
Author: JorgeRessia
Time: 10 February 2012, 2:41:11 pm
UUID: 3c6965a4-bc1f-42e4-b309-fab8e4303046
Ancestors: TextLint-Model-lr.224

Name: TextLint-Tests-lr.166
Author: lr
Time: 25 March 2012, 10:46:49 am
UUID: ed387b31-02e6-4194-91e6-18d3879dc858
Ancestors: TextLint-Tests-DamienCassou.165

And PetitParser:

Name: PetitParser-JanKurs.278
Author: JanKurs
Time: 5 May 2015, 2:39:26.475846 pm
UUID: 55ae813a-5dfd-4b4d-a98b-274f5431331e
Ancestors: PetitParser-JanKurs.277, PetitParser-JanKurs.276

Vincent Blondeau
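
To illustrate the notify: versus deprecated: distinction made in the reply above, here is a hedged sketch on a hypothetical class. MyToken, its instance variable contents and the message strings are invented for illustration; this is not the PetitParser source. It assumes only that Object>>notify: and Object>>deprecated: are available, as they are in a Pharo 4 image.

```smalltalk
"MyToken is a HYPOTHETICAL class with one instance variable, 'contents'."
MyToken >> inputValue
	^ contents

"Style described above as troublesome: #notify: signals a Warning, and an
 unhandled Warning opens a debugger. If the debugger itself sends #value
 to the receiver, each new debugger triggers another one."
MyToken >> value
	self notify: 'Use #inputValue instead'.
	^ self inputValue

"Style suggested above: mark the method with #deprecated: and forward
 to the replacement."
MyToken >> value
	self deprecated: 'Use #inputValue instead'.
	^ self inputValue
```

Only one of the two value definitions would exist in a real class; they are shown side by side for comparison.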

Vincent

Sorry for the delay in replying. The point about (TLTextTokenizer parse: 'test' startingAt: #word) is weird. You are quite right, it does give a PPToken in Pharo #40613. In #40612 it gave an array of PPToken. I checked this before sending the original message and again after receiving your first reply, and I have now rescued my #40612 image from the recycle bin and tested it again. We are using the same version, by the way, except that I have now modified mine to remove the deprecated method 'value'.

The behaviour of TLTextTokenizer in the latest image is very odd. It seems to parse only the first word of any text, even when I don't specify startingAt: #word. For instance, if I enter (TLTextTokenizer parse: 'test run'), the result is a PPToken on 'test'. However, it is not clear that we are meant to use TLTextTokenizer at all; the organisation of the TextLint parsers suggests this is an abstract superclass, and the checkers seem to use TLPatternTokenizer or TLPlainTokenizer and its subclasses exclusively. I have used TLPlainTokenizer with only one oddity, namely that it would not display the result of (TLPlainTokenizer parse: 'test run') in the PetitParser browser until I changed (TLTerminatorMark with: '') to (TLTerminatorMark with: (PPToken on: '')) in the elementList method.

I have spent probably far too long today exploring TextLint, particularly trying to find why so many of the tests fail. It looks as though PetitParser has been changed in many details since many of them were written. Anyway, I can now use TLTextLintChecker to give me a parse of any plain text as a structured document, which was my original objective, so I can get back to my project, using Pharo 4.

Thanks for your help.

Peter Kenny
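
Since the result of the tokenizer call differed between the two updates discussed in this thread, a playground sketch that tolerates both might look like the following. It uses only the expressions quoted in the messages plus the standard isArray test; what it returns depends on the image and on the TextLint and PetitParser versions loaded, so treat it as a starting point, not a verified recipe.

```smalltalk
"Playground sketch. An Array of PPToken was reported on #40612 and a single
 PPToken on #40613, so normalise before building the TLWord."
| result word |
result := TLTextTokenizer parse: 'test' startingAt: #word.
word := result isArray
	ifTrue: [ result first ]
	ifFalse: [ result ].
TLWord with: word.

"The elementList change described above wraps the empty string in a token:"
TLTerminatorMark with: (PPToken on: '').
```

TLWord with: word is the call that locked up the image in the original report, so it is safer to run this only after the deprecated value send has been replaced by inputValue.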