Hi,
if parsing a certain string according to a grammar fails, how do I get the position of that error? I want to give the user feedback on where to start looking for mistakes.
Regards, Steffen
_______________________________________________
vwnc mailing list
[hidden email]
http://lists.cs.uiuc.edu/mailman/listinfo/vwnc
The short answer is you don't. Instead, wherever a set of alternates can fail but must not, add a final alternate that is an error node. It must match some content in such a way that you can "continue" the stream. If you do not intend to continue the stream, you can simply consume all the characters up to the end at that point. Since you have matched your failure alternative, the actor can announce an error, record its position, etc. You can even use the failure match to attempt a recovery, so that more of the stream can be processed.
Cheers,
Michael
On Sep 7, 2011, at 8:24 AM, Steffen Märcker wrote:
> Hi,
>
> if parsing a certain string according to a grammar fails, how do I get the
> position of that error? I want to give the user feedback where to start
> looking for mistakes.
>
> Regards, Steffen
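The error-alternate idea described above can be sketched outside of any particular PEG library. The following is a minimal, language-neutral illustration in Python, not the VisualWorks PEG API: the function name `parse_items`, the comma-separated input format, and the two patterns are all hypothetical. The parser tries the "good" alternate first; if that fails, a final error alternate consumes everything up to the next separator and records the position, so parsing can continue.

```python
import re

# First alternate: a well-formed item (an unsigned integer).
NUMBER = re.compile(r"\d+")
# Final "error" alternate: matches anything up to the next separator,
# so the parser can resume with the rest of the input.
ERROR = re.compile(r"[^,]*")


def parse_items(text):
    """Parse comma-separated integers; return (values, errors).

    Each error is a (position, offending_text) pair.
    """
    values, errors = [], []
    pos = 0
    while True:
        m = NUMBER.match(text, pos)
        # The number alternate only succeeds if it ends at a separator or at the end.
        if m and (m.end() == len(text) or text[m.end()] == ","):
            values.append(int(m.group()))
        else:
            # Fall through to the error alternate and record where it started.
            m = ERROR.match(text, pos)
            errors.append((pos, m.group()))
        pos = m.end()
        if pos >= len(text):
            break
        pos += 1  # skip the ',' separator
    return values, errors


values, errors = parse_items("12,x7,30")
```

Here `errors` holds the start position of the bad item, which is the kind of information the actor could report back to the user, while the items after it are still parsed.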
Thanks Michael. Is my observation correct that the PEG grammar includes DefinitionError for this purpose?
Ciao, Steffen
On 07.09.2011 at 17:32, Michael Lucas-Smith <[hidden email]> wrote:
> The short answer is you don't. Instead, where alternates can fail but
> must not, add a final alternate that is an error node. It must match
> some content in such a way that you can "continue" the stream. If you
> intend to not continue the stream, at that point you can just consume
> all the characters up to the end. Since you have matched your failure
> alternative, the actor can announce an error, record its position etc.
> You can even use the failure match to try and do a recovery, so that
> more of the stream can be processed.
>
> Cheers,
> Michael
>
> On Sep 7, 2011, at 8:24 AM, Steffen Märcker wrote:
>
>> Hi,
>>
>> if parsing a certain string according to a grammar fails, how do I get
>> the
>> position of that error? I want to give the user feedback where to start
>> looking for mistakes.
>>
>> Regards, Steffen
Yep, that's right.
2011/9/8 Steffen Märcker <[hidden email]>
> Thanks Michael. Is my observation correct that the PEG grammar includes
How is it reasonable to expect the position of readWriteStream to advance more than 650 characters when writing a 650-character string to it? Is it related to multi-byte character sets or something? Is there some trick to setting up random access file streams for basic ASCII data?
Whatever happened to the principle of least surprise?
-Carl
http://www.libertybasic.com
http://www.runbasic.com
Oh, I forgot to mention it's a stream on a file. Also for what it's worth my code works on 7.4 without this weirdness. Is this a clue?
To create the stream I'm just sending asFilename readWriteStream. Then I set the position to the beginning of a record and use nextPutAll: to write the record. It writes more characters than the size of the string.
Thanks,
-Carl
On Sep 10, 2011, at 5:05 PM, Carl Gundel wrote:
> How is it reasonable to expect the position of readWriteStream to advance more than 650 characters when writing a 650 character long string to it? Is it related to multi byte character sets or something? Is there some trick to setting up random access file streams for basic ASCII data?
>
> What ever happened to the principle of least surprise?
>
> -Carl
> http://www.libertybasic.com
> http://www.runbasic.com
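The record-update pattern described here (seek to the start of a record, then overwrite it) relies on every record occupying exactly the same number of bytes. As a rough sketch of that invariant, here is a Python illustration rather than VisualWorks code: `RECORD_SIZE`, `write_record`, and the in-memory `io.BytesIO` stream are assumptions made for the example, and the stream is binary so that nothing is translated on the way out.

```python
import io

RECORD_SIZE = 650  # bytes per record, matching the 650-character records in the post


def write_record(stream, index, record):
    """Overwrite record `index` in a binary stream of fixed-size records."""
    data = record.encode("ascii")
    # If the encoded record is not exactly RECORD_SIZE bytes, every later
    # record offset would be wrong, so refuse to write it.
    if len(data) != RECORD_SIZE:
        raise ValueError(f"record is {len(data)} bytes, expected {RECORD_SIZE}")
    stream.seek(index * RECORD_SIZE)
    stream.write(data)
    return stream.tell()


# Three blank records, then overwrite the middle one.
stream = io.BytesIO(b" " * (3 * RECORD_SIZE))
end = write_record(stream, 1, "A" * RECORD_SIZE)
```

If the stream layer expands any character into more than one byte, the position after the write no longer lands on the next record boundary, which is the symptom reported above.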
Thanks Eliot,
I tried the following when opening my file but it made no difference.
(myFilenameString asFilename withEncoding: #UTF_8) readWriteStream
Any ideas?
-Carl
On Sep 10, 2011, at 5:13 PM, Eliot Miranda wrote:
Yeah, it turns out to be as simple as this: it writes CRLF for every CR in my data. Setting UTF_8 encoding doesn't seem to fix it. What's the right way to prevent this?
-Carl
On Sep 10, 2011, at 5:47 PM, Carl Gundel wrote:
Do you mean your stream is advancing >650 bytes for 650 characters? That's actually not so surprising - even just a single "Character cr" is worth two bytes on Windows. As Eliot hinted, a non-ASCII character (accented characters, curly quotes, en or em dash etc.) will also be encoded into more than one byte in a lot of common encodings. Even an ASCII character can be encoded into more than one byte in some common encodings. 7.8 has a lot more facilities for that kind of thing than 7.4, because these days the assumption that all users of programs are English speakers in the US is rarely valid.
By default, VW uses the platform's default encoding and line-end convention. Try stepping into the lower levels of the code to see how these things work, or read the InternationalGuide manual for a brief introduction.
HTH,
Steve
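The byte arithmetic above can be checked with a small sketch. This is Python used purely for illustration, not VisualWorks code; `encoded_size` is a hypothetical helper that mimics a stream which replaces each CR with a chosen line end and then encodes the text.

```python
def encoded_size(text, encoding="ascii", line_end="\r"):
    """Bytes written for `text` if every CR is replaced by `line_end`."""
    return len(text.replace("\r", line_end).encode(encoding))


# A 650-character record: ten lines of 64 characters, each ending in CR.
record = ("x" * 64 + "\r") * 10

untranslated = encoded_size(record)                    # CR stays CR
translated = encoded_size(record, line_end="\r\n")     # CR becomes CRLF
accented = encoded_size("é", encoding="utf-8")         # non-ASCII character
wide_ascii = encoded_size("A", encoding="utf-16-le")   # ASCII in a 2-byte encoding
```

With these inputs the untranslated record is 650 bytes and the CRLF-translated one is 660, one extra byte per CR. The single "é" takes 2 bytes in UTF-8, and even the ASCII "A" takes 2 bytes in UTF-16.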
From: [hidden email] on behalf of Carl Gundel
Sent: Sun 11/09/2011 00:47
To: Eliot Miranda
Cc: VWNC
Subject: Re: [vwnc] [7.8] readWriteStream position wrong?
Thanks Eliot,
I tried the following when opening my file but it made no difference.
(myFilenameString asFilename withEncoding: #UTF_8) readWriteStream
Any ideas?
-Carl
On Sep 10, 2011, at 5:13 PM, Eliot Miranda wrote:
Carl,
#lineEndTransparent, but also see #lineEndCR, #lineEndLF, #lineEndCRLF and #lineEndAuto.
HTH,
-Boris
From: [hidden email] [mailto:[hidden email]] On Behalf Of Carl Gundel
Yeah, it turns out to be as simple as it writing CRLF for every CR in my data. Setting UTF_8 encoding doesn't seem to fix it. What's the right way to prevent this?
-Carl
On Sep 10, 2011, at 5:47 PM, Carl Gundel wrote:
Thanks Eliot,
I tried the following when opening my file but it made no difference.
(myFilenameString asFilename withEncoding: #UTF_8) readWriteStream
Any ideas?
-Carl
On Sep 10, 2011, at 5:13 PM, Eliot Miranda wrote:
On Sat, Sep 10, 2011 at 2:05 PM, Carl Gundel <[hidden email]> wrote:
How is it reasonable to expect the position of readWriteStream to advance more than 650 characters when writing a 650 character long string to it? Is it related to multi byte character sets or something? Is there some trick to setting up random access file streams for basic ASCII data?
UTF-8
-- Eliot