Patch for parsing dates and time

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
9 messages Options
Reply | Threaded
Open this post in threaded view
|

Patch for parsing dates and time

Maarten van Beek
Hello,

While looking at the dates tests with Holger we've noticed that these
tests weren't testing the parsing of dates right. It would check
whether parsing a date would yield the same object as parsing a date
with an added suffix and it checked whether a stream on the date
string with the added suffix indeed included the added suffix. It
should test whether a stream with an added suffix would still contain
this added suffix after the date was parsed. I've edited the tests and
noticed that this revealed some bugs, the suffix wasn't always left
properly in the stream.

The following patch should fix this issue. It patches tests/dates.st
aswel as dates.ok as this contained some errors. Also it fixes bugs in
Date.st, AnsiDates.st and Time.st as they all contained a similar bug
that wouldn't leave the suffix intact.

I hope someone can find the time to review this patch.

Maarten van Beek

_______________________________________________
help-smalltalk mailing list
[hidden email]
https://lists.gnu.org/mailman/listinfo/help-smalltalk

dates.patch (9K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Patch for parsing dates and time

Paolo Bonzini-2
I admit I haven't looked at the patch, only the testsuite results.

Il 23/07/2012 04:32, Maarten van Beek ha scritto:

> diff --git a/tests/dates.ok b/tests/dates.ok
> index 1bbfbd6..3e064d1 100644
> --- a/tests/dates.ok
> +++ b/tests/dates.ok
> @@ -830,10 +830,10 @@ Execution begins...
>  1-Feb-2011->'2011-02-01'
>  1-Feb-2011->'Feb 1 2011'
>  1-Feb-2011->'Feb 1 2011'
> -1-Feb--2011->'-2011-02-01'
> -1-Feb--2011->'-2011-02-01'
> -1-Feb--2011->'Feb 1 -2011'
> -1-Feb--2011->'Feb 1 -2011'
> +1-Feb-2011->'-2011-02-01'
> +1-Feb-2011->'-2011-02-01'
> +1-Feb-2011->'Feb 1 -2011'
> +1-Feb-2011->'Feb 1 -2011'

Here the year is "-2011" before your patch and "+2011" after.

>  9:00:02->'09:00:02'
>  9:00:02->'09:00:02'
>  9:00:02->'09:00:02'
> @@ -996,12 +996,12 @@ Execution begins...
>   2011-02-01T00:00:00+00:00->'2011-02-01'
>   2011-02-01T00:00:00+00:00->'Feb 1 2011'
>   2011-02-01T00:00:00+00:00->'Feb 1 2011'
> --2011-02-01T09:00:00+01:30->'-2011-02-01T09:00+01:30'
> --2011-02-01T09:00:00+01:30->'-2011-02-01T09:00+01:30'
> --2011-02-01T00:00:00+00:00->'-2011-02-01'
> --2011-02-01T00:00:00+00:00->'-2011-02-01'
> --2011-02-01T00:00:00+00:00->'Feb 1 -2011'
> --2011-02-01T00:00:00+00:00->'Feb 1 -2011'
> + 2011-02-01T09:00:00+01:30->'-2011-02-01T09:00+01:30'
> + 2011-02-01T09:00:00+01:30->'-2011-02-01T09:00+01:30'
> + 2011-02-01T00:00:00+00:00->'-2011-02-01'
> + 2011-02-01T00:00:00+00:00->'-2011-02-01'
> + 2011-02-01T00:00:00+00:00->'Feb 1 -2011'
> + 2011-02-01T00:00:00+00:00->'Feb 1 -2011'

Same here.

So I'm not sure exactly where the bug is.  Can you give an example of
the case where a stream with an added suffix would not contain
this added suffix after the date was parsed?

Also, unfortunately using #position: is not possible in #readFrom:,
because not all Streams support it (think Sockets).  However, using
#peek is possible.

Paolo


>   2011-02-01T09:00:00+01:30->'Feb 1 2011 09:00  +01:30'
>   2011-02-01T09:00:00+01:30->'Feb 1 2011 09:00  +01:30'
>   2011-02-01T09:00:00+01:30->'Feb 1 2011 09:00  +01:30'



_______________________________________________
help-smalltalk mailing list
[hidden email]
https://lists.gnu.org/mailman/listinfo/help-smalltalk
Reply | Threaded
Open this post in threaded view
|

Re: Patch for parsing dates and time

Holger Freyther
In reply to this post by Maarten van Beek
On Mon, Jul 23, 2012 at 04:32:27AM +0200, Maarten van Beek wrote:
> The following patch should fix this issue. It patches tests/dates.st
> aswel as dates.ok as this contained some errors. Also it fixes bugs in
> Date.st, AnsiDates.st and Time.st as they all contained a similar bug
> that wouldn't leave the suffix intact.

Hi,

could you please explain why the output of dates.ok changes? E.g.
explain why the previous output was wrong and the new one is correct. E.g.
I think it should still be 1-Feb--2011 if it was B.C.?

thanks
        holger


> _______________________________________________
> help-smalltalk mailing list
> [hidden email]
> https://lists.gnu.org/mailman/listinfo/help-smalltalk


_______________________________________________
help-smalltalk mailing list
[hidden email]
https://lists.gnu.org/mailman/listinfo/help-smalltalk
Reply | Threaded
Open this post in threaded view
|

Re: Patch for parsing dates and time

Maarten van Beek
Indeed I didn't realise negative dates where possible I thought it was
an error. I'll fix that. However the following code shows the bug:

        stream := '2011-02-01abcd' readStream.
        date := Date readFrom: stream.
        result := stream upToEnd.

        result printNl.

The suffix to the date is abcd, so after parsing the date the stream
should still contain abcd. However result only holds 'bcd'. This is
not tested in the current test cases. Time, Duration and DateTime have
the same bug.
Not being able to use position: is a problem though. When parsing this
time for example: '09:00::1234' you should read 09:00 as time and
leave ::1234 in the stream, but you have to look ahead 2 characters to
notice this.

Maarten van Beek


On Mon, Jul 23, 2012 at 9:16 AM, Holger Hans Peter Freyther
<[hidden email]> wrote:

> On Mon, Jul 23, 2012 at 04:32:27AM +0200, Maarten van Beek wrote:
>> The following patch should fix this issue. It patches tests/dates.st
>> aswel as dates.ok as this contained some errors. Also it fixes bugs in
>> Date.st, AnsiDates.st and Time.st as they all contained a similar bug
>> that wouldn't leave the suffix intact.
>
> Hi,
>
> could you please explain why the output of dates.ok changes? E.g.
> explain why the previous output was wrong and the new one is correct. E.g.
> I think it should still be 1-Feb--2011 if it was B.C.?
>
> thanks
>         holger
>
>
>> _______________________________________________
>> help-smalltalk mailing list
>> [hidden email]
>> https://lists.gnu.org/mailman/listinfo/help-smalltalk
>
>
> _______________________________________________
> help-smalltalk mailing list
> [hidden email]
> https://lists.gnu.org/mailman/listinfo/help-smalltalk

_______________________________________________
help-smalltalk mailing list
[hidden email]
https://lists.gnu.org/mailman/listinfo/help-smalltalk
Reply | Threaded
Open this post in threaded view
|

Re: Patch for parsing dates and time

Paolo Bonzini-2
Il 23/07/2012 15:47, Maarten van Beek ha scritto:

> Indeed I didn't realise negative dates where possible I thought it was
> an error. I'll fix that. However the following code shows the bug:
>
>         stream := '2011-02-01abcd' readStream.
>         date := Date readFrom: stream.
>         result := stream upToEnd.
>
>         result printNl.
>
> The suffix to the date is abcd, so after parsing the date the stream
> should still contain abcd. However result only holds 'bcd'. This is
> not tested in the current test cases. Time, Duration and DateTime have
> the same bug.

Ok, thanks.

> Not being able to use position: is a problem though. When parsing this
> time for example: '09:00::1234' you should read 09:00 as time and
> leave ::1234 in the stream, but you have to look ahead 2 characters to
> notice this.

I think it's safe to consider this example ill-formed, and not care
about it.

Paolo

_______________________________________________
help-smalltalk mailing list
[hidden email]
https://lists.gnu.org/mailman/listinfo/help-smalltalk
Reply | Threaded
Open this post in threaded view
|

Re: Patch for parsing dates and time

Maarten van Beek
The attached patch accepts negative dates and doesn't use position:
anymore. I've removed the test cases that weren't possible to succeed
because of this. This includes some test cases for DateTime, where
DateTime skips over all separators to see if there's any usable
content left. These separators are thrown away.

Maarten van Beek

On Mon, Jul 23, 2012 at 4:57 PM, Paolo Bonzini <[hidden email]> wrote:

> Il 23/07/2012 15:47, Maarten van Beek ha scritto:
>> Indeed I didn't realise negative dates where possible I thought it was
>> an error. I'll fix that. However the following code shows the bug:
>>
>>         stream := '2011-02-01abcd' readStream.
>>         date := Date readFrom: stream.
>>         result := stream upToEnd.
>>
>>         result printNl.
>>
>> The suffix to the date is abcd, so after parsing the date the stream
>> should still contain abcd. However result only holds 'bcd'. This is
>> not tested in the current test cases. Time, Duration and DateTime have
>> the same bug.
>
> Ok, thanks.
>
>> Not being able to use position: is a problem though. When parsing this
>> time for example: '09:00::1234' you should read 09:00 as time and
>> leave ::1234 in the stream, but you have to look ahead 2 characters to
>> notice this.
>
> I think it's safe to consider this example ill-formed, and not care
> about it.
>
> Paolo

_______________________________________________
help-smalltalk mailing list
[hidden email]
https://lists.gnu.org/mailman/listinfo/help-smalltalk

dates.patch2 (21K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Patch for parsing dates and time

Holger Freyther
On Mon, Jul 23, 2012 at 10:24:56PM +0200, Maarten van Beek wrote:
> The attached patch accepts negative dates and doesn't use position:
> anymore. I've removed the test cases that weren't possible to succeed
> because of this. This includes some test cases for DateTime, where
> DateTime skips over all separators to see if there's any usable
> content left. These separators are thrown away.

Hi,

in general it is good to test what can be parsed and what can not
be parsed. E.g. if someone wants to parse 09:00::1234 as 9:00 and
it fails it easy to point him to the testcase that excludes this
behavior, Pharo1.4 will throw an exception when parsing the above
string as a Time, I think it would be acceptable to do the same.

Somehow I am not able to quote the attachment from within mutt but
you have some unused variables (e.g. you added 'pos' but don't make
use of it).

For Paolo I have included the diff of expout to stdout when using
the right stream for parsing and then comparing what is left inside
the stream. Maarten's test results are already a lot better.

holger


_______________________________________________
help-smalltalk mailing list
[hidden email]
https://lists.gnu.org/mailman/listinfo/help-smalltalk

failed_tests.tx (9K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Patch for parsing dates and time

Maarten van Beek
Here's the new patch, I removed the unused variables and added extra
tests which consisting of the tests that are expected to fail.

Maarten van Beek

On Tue, Jul 24, 2012 at 11:16 AM, Holger Hans Peter Freyther
<[hidden email]> wrote:

> On Mon, Jul 23, 2012 at 10:24:56PM +0200, Maarten van Beek wrote:
>> The attached patch accepts negative dates and doesn't use position:
>> anymore. I've removed the test cases that weren't possible to succeed
>> because of this. This includes some test cases for DateTime, where
>> DateTime skips over all separators to see if there's any usable
>> content left. These separators are thrown away.
>
> Hi,
>
> in general it is good to test what can be parsed and what can not
> be parsed. E.g. if someone wants to parse 09:00::1234 as 9:00 and
> it fails it easy to point him to the testcase that excludes this
> behavior, Pharo1.4 will throw an exception when parsing the above
> string as a Time, I think it would be acceptable to do the same.
>
> Somehow I am not able to quote the attachment from within mutt but
> you have some unused variables (e.g. you added 'pos' but don't make
> use of it).
>
> For Paolo I have included the diff of expout to stdout when using
> the right stream for parsing and then comparing what is left inside
> the stream. Maarten's test results are already a lot better.
>
> holger
>
>
> _______________________________________________
> help-smalltalk mailing list
> [hidden email]
> https://lists.gnu.org/mailman/listinfo/help-smalltalk
>

_______________________________________________
help-smalltalk mailing list
[hidden email]
https://lists.gnu.org/mailman/listinfo/help-smalltalk

dates3.patch (21K) Download Attachment
dates-failing.ok (1K) Download Attachment
dates-failing.st (4K) Download Attachment
Reply | Threaded
Open this post in threaded view
|

Re: Patch for parsing dates and time

Maarten van Beek
Here's an updated version of the patch, I've removed unnecessary stuff from
this one.
dates-failing.ok and dates-failing.st belong in the tests folder. I'm not
able to check my mail
until the 15th of august, so until then I won't be able to process any
remarks.

Maarten van Beek

On Wed, Jul 25, 2012 at 1:20 AM, Maarten van Beek
<[hidden email]>wrote:

> Here's the new patch, I removed the unused variables and added extra
> tests which consisting of the tests that are expected to fail.
>
> Maarten van Beek
>
> On Tue, Jul 24, 2012 at 11:16 AM, Holger Hans Peter Freyther
> <[hidden email]> wrote:
> > On Mon, Jul 23, 2012 at 10:24:56PM +0200, Maarten van Beek wrote:
> >> The attached patch accepts negative dates and doesn't use position:
> >> anymore. I've removed the test cases that weren't possible to succeed
> >> because of this. This includes some test cases for DateTime, where
> >> DateTime skips over all separators to see if there's any usable
> >> content left. These separators are thrown away.
> >
> > Hi,
> >
> > in general it is good to test what can be parsed and what can not
> > be parsed. E.g. if someone wants to parse 09:00::1234 as 9:00 and
> > it fails it easy to point him to the testcase that excludes this
> > behavior, Pharo1.4 will throw an exception when parsing the above
> > string as a Time, I think it would be acceptable to do the same.
> >
> > Somehow I am not able to quote the attachment from within mutt but
> > you have some unused variables (e.g. you added 'pos' but don't make
> > use of it).
> >
> > For Paolo I have included the diff of expout to stdout when using
> > the right stream for parsing and then comparing what is left inside
> > the stream. Maarten's test results are already a lot better.
> >
> > holger
> >
> >
> > _______________________________________________
> > help-smalltalk mailing list
> > [hidden email]
> > https://lists.gnu.org/mailman/listinfo/help-smalltalk
> >
>

_______________________________________________
help-smalltalk mailing list
[hidden email]
https://lists.gnu.org/mailman/listinfo/help-smalltalk

dates4.patch (20K) Download Attachment
dates-failing.ok (1K) Download Attachment
dates-failing.st (4K) Download Attachment