hi guys

can one of you take a look at this fix, because I'm confused.

Kernel-DavidHotham.538

    year < 20 ifTrue: [year := 2000 + year]
was
    year < 10 ifTrue: [year := 2000 + year]

both solutions look strange to me.

http://code.google.com/p/pharo/issues/detail?id=1749

readFrom: aStream
    "Read a Date from the stream in any of the forms:
        <day> <monthName> <year>        (5 April 1982; 5-APR-82)
        <monthName> <day> <year>        (April 5, 1982)
        <monthNumber> <day> <year>      (4/5/82)
        <day><monthName><year>          (5APR82)"
    | day month year |
    aStream peek isDigit
        ifTrue: [day := Integer readFrom: aStream].
    [aStream peek isAlphaNumeric]
        whileFalse: [aStream skip: 1].
    aStream peek isLetter
        ifTrue: ["number/name... or name..."
            month := (String new: 10) writeStream.
            [aStream peek isLetter]
                whileTrue: [month nextPut: aStream next].
            month := month contents.
            day isNil
                ifTrue: ["name/number..."
                    [aStream peek isAlphaNumeric]
                        whileFalse: [aStream skip: 1].
                    day := Integer readFrom: aStream]]
        ifFalse: ["number/number..."
            month := Month nameOfMonth: day.
            day := Integer readFrom: aStream].
    [aStream peek isAlphaNumeric]
        whileFalse: [aStream skip: 1].
    year := Integer readFrom: aStream.
>>  year < 20 ifTrue: [year := 2000 + year]
        ifFalse: [ year < 1900 ifTrue: [ year := 1900 + year]].

    ^ self
        year: year
        month: month
        day: day

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
don't know if this addresses your confusion:

before: '5-APR-10' got parsed as April 5, 1910
after: '5-APR-10' gets parsed as April 5, 2010

so it kind of fixes the passing of time :-)

This kind of cleverness should not be there, IMHO.

a) you want to parse special date formats: you should use some explicit format descriptions (that %M%D%Y stuff) and make it explicit.

b) you want to make some best effort to parse whatever string as a date: that is a heuristic and it should be in another method, not something as basic as #readFrom:. Maybe call the method guessFrom: aStream ;-)

Cheers

Matthias

On Sat, Jan 16, 2010 at 10:57 AM, Stéphane Ducasse <[hidden email]> wrote:
> [...]
On Jan 16, 2010, at 11:07 AM, Matthias Berth wrote:

> don't know if this addresses your confusion:
>
> before: '5-APR-10' got parsed as April 5, 1910
> after: '5-APR-10' gets parsed as April 5, 2010
>
> so it kind of fixes the passing of time :-)
>
> This kind of cleverness should not be there, IMHO.

+1
Confirmed.

> a) you want to parse special date formats: you should use some
> explicit format descriptions (that %M%D%Y stuff) and make it explicit.
>
> b) you want to make some best effort to parse whatever string as a
> date: that is a heuristic and it should be in another method, not
> something as basic as #readFrom:. Maybe call the method guessFrom:
> aStream ;-)

I agree, because in 10 years we will have to change it to < 30, and it is plain wrong anyway :)

Stef
In reply to this post by Stéphane Ducasse
Hello,

The motivation behind this fix is that I am parsing historical stock data obtained from Google, in which dates are provided like this: "15-Jan-10". Of course I want this to read as 2010, not 1910.

I agree that the existing solution (with or without my tweak) looks ugly.

I gave brief consideration to a bigger fix which first checked what the current year was and then calculated what century to add to a low date. But, frankly, I couldn't be bothered. I figure that a one-character fix that is good for ten years isn't too bad.

I am quite content to defer to others as to what the right solution is here. So long as there ends up being a convenient way to read two-digit years correctly, I'll be happy.

David

"Stéphane Ducasse" <[hidden email]> wrote in message news:[hidden email]...
> [...]
no problem
we were not criticizing your fix, more the problem in general.
Would be nice to have a guessFrom:

Stef

On Jan 16, 2010, at 12:07 PM, David Hotham wrote:
> [...]
This whole subject is my number one pet hate: ambiguous date representation. Seriously, after y2k and all the associated messing about fixing broken dates, one would think people would want to prevent anything like that ever happening again!

IMNSHO, *all* dates should either be expressed as YYYY-MM-DD, or in a form using the three elements: month name; two-digit day number; four-digit year, in arbitrary order. Otherwise there will always be ambiguity, which will lessen slightly in 2013, and slightly more in 2032, but will never go away as long as people insist on using "10/6/11" and calling it a date.

As for the problem in hand, I think the best solution would be to reject out of hand Strings that are ambiguous, and develop a tool for processing *collections* of "date" strings. This tool could infer which elements represented what by looking at the whole collection, and noticing patterns that give the clues needed to make the best guess, in much the same way as a human reader does.

--
Cheers,
Peter.

On 16 jan 2010, at 13.39, Stéphane Ducasse <[hidden email]> wrote:
> [...]
In reply to this post by Stéphane Ducasse
I'm afraid the 'fix' is trying to repair something that is not broken!

Date fromString: '6-Jan-10'. giving "6 January 1910" has nothing wrong with it. Within five years we'll have users/programmers complaining that:

Date fromString: '6-Jan-16'. gave "6 January 1916" and *obviously* it should have given "6 January 2016"!!

Rather, document that years given with two figures get counted from 1900 (as it used to be in the last century) and have people use four digits for years in the 2000s.

Otherwise:

1) We need to get used to the concept of an epoch time for Pharo; or

2) the number to be tested has to be the current year!

My 0.019999. . .

--
Cesar Rabak

On 16/01/2010 07:57, Stéphane Ducasse < [hidden email] > wrote:
> [...]
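Cesar's option 2 (test against the current year rather than a hard-coded 10 or 20) might be sketched as below. This is only an illustration for a Playground, not the code in Kernel-DavidHotham.538: twoDigitYear, yearsAhead and the rule "accept at most yearsAhead years into the future, otherwise fall back one century" are assumptions made for the example.

```smalltalk
"Sketch only: expand a two-digit year relative to the current year.
 twoDigitYear and yearsAhead are hypothetical names, not Pharo API."
| twoDigitYear yearsAhead currentYear candidate fullYear |
twoDigitYear := 10.
yearsAhead := 1.
currentYear := Date today year.
"First place the year in the current century."
candidate := currentYear // 100 * 100 + twoDigitYear.
"If that lands too far in the future, assume the previous century."
fullYear := candidate > (currentYear + yearsAhead)
    ifTrue: [candidate - 100]
    ifFalse: [candidate].
fullYear
```

Under this rule, 10 evaluated in 2010 answers 2010 and 82 answers 1982, and the cut-off moves by itself each year instead of needing a new constant every decade. Whether a window that depends on the date of evaluation is acceptable inside #readFrom: is exactly the question debated in this thread.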
Yes

Date fromString: '6-Jan-03'
-> 6 January 2003

is also wrong. So David just tried to patch this wrong behavior.

Stef

On Jan 16, 2010, at 6:02 PM, [hidden email] wrote:
> [...]
In reply to this post by csrabak
Well, oddly, I encountered a similar issue on the iPhone the other week when dealing with Apple's Date Scanner logic.

It seems that if a person entered 1/1/09 that would generate Jan 1st, 2009, but 1/1/9 would generate Jan 1st, year 1 AD.

I recall reading a novel set in 1715 where they talked about the year 3, meaning 1703, not 3 AD. But I can't recall anyone this millennium using year 9 to refer to 2009, although they might tap that into an entry field.

I note you *must* in this case separate the *correct* way of thinking from the *common mostly correct* way of thinking. Most people you meet in the USA would think that 1/1/09 means 2009, NOT 1909.

Obviously for fields that *could* hold a date more than 100 years away you would have to enforce four-digit years, or something, but defaulting to 1909 versus 2009 might have interesting side-effects. Also, an expiry date in 2014 written as 1/1/14 is not assumed to be 1914.

On 2010-01-16, at 9:02 AM, [hidden email] wrote:
> [...]

--
===========================================================================
John M. McIntosh <[hidden email]> Twitter: squeaker68882
Corporate Smalltalk Consulting Ltd. http://www.smalltalkconsulting.com
===========================================================================
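For the fields John mentions where four-digit years have to be enforced, a reader could refuse short years instead of guessing a century. A minimal sketch for a Playground, assuming that anything below 100 counts as an ambiguous short year (which also rejects genuine years 1 to 99 AD); strictYear is a hypothetical helper block, not part of Date:

```smalltalk
"Sketch only: strictYear is a hypothetical helper, not Pharo API.
 Assumption: any year below 100 is treated as ambiguous and rejected."
| strictYear |
strictYear := [:year |
    year < 100
        ifTrue: [Error new signal: 'Ambiguous short year: ', year printString]
        ifFalse: [year]].

"A four-digit year passes through unchanged."
strictYear value: 1982.

"A short year signals an Error; here the handler answers the message text."
[strictYear value: 9]
    on: Error
    do: [:e | e messageText]
```

A lenient entry point (the guessFrom: idea from earlier in the thread) could then apply a century heuristic separately, so callers choose explicitly between strict and best-effort parsing.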
In reply to this post by Stéphane Ducasse
I think a solution that can be tuned or tweaked to its application would be most useful. It's all very well to insist on users using four-digit years, but for processing historical data, for example, one doesn't have the luxury of insisting on the data format, but has to use what is there :-(

Perhaps a variable that determines the transition from 19xx to 20xx would be appropriate?

David

PS- here in Canada we interpret 4/5/xxxx the opposite to the USA. I always use yyyy.mm.dd.hh.ss to the appropriate resolution :-)

On 2010-01-16, at 10:42 AM, Stéphane Ducasse <[hidden email]> wrote:
> [...]
But if you deal with incomplete data, why not build your own reader and control it?
I think that a library class cannot be tuned to deal with all kinds of crazy situations.
I would prefer that Date is robust and consistent and that people implement their own custom solution.

Does anybody know how it is done in other languages?

Stef

On Jan 16, 2010, at 9:36 PM, David Harris wrote:
> [...]
Stef,
+1

What do you mean by "other languages"? If you mean dialects of Smalltalk, Dolphin defines a hierarchy of *ToText converters, including ones for Date and Time. Where appropriate (as for date and time), they accept a format string that governs both the left-to-right (e.g. Date to String) and right-to-left conversions. Anything that does not match the format raises an error.

When faced with garbage data (2-digit dates are a good example of same, IMHO), I end up making multiple converters with different formats, finding (in sensible order) which one accepts the input, doing any post-hoc tests I feel appropriate to validate that it read what I thought it read, and then setting about trying to figure out, for example, just what in the DLL 07 means for a date.

I am working on a set of Dolphin-like converters as part of my UI package. It's not much yet.

If you thought y2k (laziness) was bad, wait for 2037 (design by committee) =:0

Bill

-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Stéphane Ducasse
Sent: Saturday, January 16, 2010 4:06 PM
To: [hidden email]
Subject: Re: [Pharo-project] Date fromString: '6-Jan-10'

But if you deal with incomplete data, why not build your own reader and control it? I think that a library class cannot be tuned to deal with every kind of crazy situation. I would prefer that Date is robust and consistent and that people implement their own custom situations.

Does anybody know how it is done in other languages?

Stef

On Jan 16, 2010, at 9:36 PM, David Harris wrote:

> I think a solution that can be tuned or tweaked to its application would
> be most useful. It's all very well to insist on users using four-digit
> years, but for processing historical data, for example, one doesn't have
> the luxury of insisting on the data format, but has to use what is
> there :-(
>
> Perhaps a variable that determines the transition from 19xx to 20xx
> would be appropriate?
>
> David
> PS- here in Canada we interpret 4/5/xxxx the opposite to the USA. I
> always use yyyy.mm.dd.hh.ss to the appropriate resolution :-)
>
> On 2010-01-16, at 10:42 AM, Stéphane Ducasse <[hidden email]> wrote:
>
>> Yes
>>
>> Date fromString: '6-Jan-03'
>> -> 6 January 2003
>>
>> is also wrong. So david just tried to patch this wrong behavior.
>>
>> Stef
>>
>> On Jan 16, 2010, at 6:02 PM, [hidden email] wrote:
>>
>>> I'm afraid the 'fix' is trying to repair something that is not broken!
>>>
>>> Date fromString: '6-Jan-10'. giving "6 January 1910" has nothing
>>> wrong with it. Within five years we'll have users/programmers
>>> complaining that:
>>>
>>> Date fromString: '6-Jan-16'. gave "6 January 1916" and *obviously*
>>> it should have given "6 January 2015"!!
>>>
>>> Rather document that years given with two figures get counted from
>>> 1900 (as it used to be in last century) and have people use four
>>> digits for years in the 2000s.
>>>
>>> Otherwise:
>>>
>>> 1) We need to get used to the concept of an epoch time for Pharo; or
>>>
>>> 2) the number to be tested has to be the current year!
>>>
>>> My 0.019999. . .
>>>
>>> --
>>> Cesar Rabak
|
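The try-several-strict-converters approach Bill describes can be sketched roughly as follows. This is Python rather than Dolphin or Pharo, and `parse_strict` and `CANDIDATE_FORMATS` are hypothetical names chosen for illustration; only `datetime.strptime` is a real library call. Every candidate format demands a four-digit year, so a two-digit year is rejected with an error instead of being guessed at.

```python
from datetime import date, datetime

# Hypothetical candidate formats, tried in this order. All of them require
# a four-digit year (%Y), so "82" is never silently accepted as a year.
CANDIDATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y", "%d.%m.%Y")


def parse_strict(text: str, formats=CANDIDATE_FORMATS) -> date:
    """Return the date for the first format that accepts text, else raise."""
    for fmt in formats:
        try:
            return datetime.strptime(text, fmt).date()
        except ValueError:
            continue  # this converter rejected the input; try the next one
    raise ValueError(f"no known format accepts {text!r}")
```

Any post-hoc plausibility checks (for example, rejecting dates in the future) would go in the caller, after a converter has accepted the input.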
In reply to this post by johnmci
All these cases only reinforce, IMNSHO, that the specific 'user' (in fact programmer) input/data entry/conversion routine should be responsible for the critique of the string, and that we should accept only well-formed strings (some standards exist, like RFC-822).

Otherwise we'll end up with a lot of quasi-similar methods, each slightly different to accommodate some very specific need, and the effort to document and maintain this will grow bigger than our resources.

my 0.001999

--
Cesar Rabak

On 16/01/2010 18:09, John M McIntosh < [hidden email] > wrote:

Well, oddly, I encountered a similar issue on the iPhone the other week when dealing with Apple's Date Scanner logic. It seems that if a person entered 1/1/09, that would generate Jan 1st, 2009, but 1/1/9 would generate Jan 1st, year 1 AD.

I recall reading a novel set in 1715 where they talked about the year 3, meaning 1703, not 3 AD. But I can't recall anyone this millennium using year 9 to refer to 2009, although they might tap that into an entry field.

I note you *must* in this case separate the *correct* way of thinking from the *common, mostly correct* way of thinking. Most people you meet in the USA would think that 1/1/09 means 2009, NOT 1909. Obviously, for fields that *could be* a date more than 100 years away you would have to enforce four-digit years, or something, but defaulting to 1909 versus 2009 might have interesting side effects.

Also, an expiry date in 2014 written as 1/1/14 is not assumed to be 1914.

--
===========================================================================
John M. McIntosh Twitter: squeaker68882
Corporate Smalltalk Consulting Ltd. http://www.smalltalkconsulting.com
===========================================================================
|
In reply to this post by Stéphane Ducasse
This seems to have been more controversial than I expected. I'm going to write this post in which I will:

- try to make the case for supporting 2-digit years in some sensible way
- answer some of the counter-arguments and questions that people have raised

... at which point I intend to withdraw, and the community will no doubt reach some sensible conclusion.

David

The case for accepting 2-digit years is, I think, very simple: it's useful. Like it or not, data sometimes comes that way and this has to be dealt with.

Granted, this means that we have to introduce some heuristic to guess which century is intended. However, I don't see that this is so bad. Amongst the possible approaches are:

- Following the Posix standard for strptime, the date is assumed to be between 1969 and 2068 (see http://www.opengroup.org/onlinepubs/009695399/functions/strptime.html). Python goes this way, for example.

- the date is assumed to be between 80 years in the past and 20 years in the future (e.g. Java's SimpleDateFormat, see http://java.sun.com/javase/7/docs/api/java/text/SimpleDateFormat.html)

- allow the 100-year period in which 2-digit years will be placed to be specified (e.g. the Java SimpleDateFormat also allows this)

Now I will address some concerns that people have raised.

The first is essentially aesthetic: this kind of cleverness should not be there, and everyone should always use four-digit years. I've some sympathy with this, and it would be wonderful if everyone was as principled as we are. But, alas, it is not so! My judgment is that it's worth a little ugliness to be able to deal with the common case of two-digit years.

A second objection was that the code was right all along: '6-Jan-10' should correctly be parsed as 6th January 1910. This seems to me peculiar. I understand a position that says that two-digit years should not be accepted at all, but arguing that they should be accepted and should be interpreted to be 100 or more years ago... well, this seems unlikely to be the most useful approach.

A third objection was that this is the thin end of the wedge: if we start accepting two-digit years, who knows where the madness will end? We will have to deal with all kinds of crazy situations!

I think that this objection is just wrong. No one is arguing for three-digit dates, or hexadecimal dates, or any other crazy stuff. Two-digit years are commonplace. Other crazy things are not commonplace. Let's draw the line where it makes sense to draw the line: to my mind handling two-digit years clearly falls on the 'useful' side and not the 'crazy' side.

Finally, I must note the irony of one poster declaring that we should only accept well-formed strings, and pointing us at RFC822 for reference. Years in RFC822 are defined to have two digits. The RFC does not say what century they should fall in. (This has been obsoleted by RFC2822, which uses four-digit years.)
|
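The first approach in David's list, the fixed 1969 to 2068 window, is small enough to write out. The sketch below is Python, and `posix_two_digit_year` is a hypothetical helper name used only for illustration; it is not part of Pharo or of any library.

```python
def posix_two_digit_year(yy: int) -> int:
    """Map a two-digit year 0..99 onto the fixed window 1969..2068."""
    if not 0 <= yy <= 99:
        raise ValueError("expected a two-digit year in the range 0..99")
    # 69..99 belong to the 1900s, 00..68 to the 2000s.
    return 1900 + yy if yy >= 69 else 2000 + yy
```

Python's own `%y` directive in `datetime.strptime` follows the same convention, so `'6-Jan-10'` style input lands in 2010 there. The patched `readFrom:` in this thread differs only in where it switches century: at 20 rather than at 69.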
Thanks David.

Now I think that we could have Date readFromTwoDigitYear: or something like that, and that Date readFromString: always reads a coherent 4-digit year.

> Granted, this means that we have to introduce some heuristic to guess which
> century is intended.

But we can have clever vs. explicit:

    readFrom: '06-jan-30'
        what is it, 1930 or 2030?

    readFrom: '06-jan-30' usingBaseForYear: 2000
        -> 2030

I prefer explicit because it produces more robust code.

> - the date is assumed to be between 80 years in the past and 20 years in
> the future (eg Java's SimpleDateFormat, see
> http://java.sun.com/javase/7/docs/api/java/text/SimpleDateFormat.html)
>
> - allow the 100 year-period in which 2-digit years will be placed to be
> specified (eg the Java SimpleDateFormat also allows this)

• Year: For formatting, if the number of pattern letters is 2, the year is truncated to 2 digits; otherwise it is interpreted as a number.

For parsing, if the number of pattern letters is more than 2, the year is interpreted literally, regardless of the number of digits. So using the pattern "MM/dd/yyyy", "01/11/12" parses to Jan 11, 12 A.D.

For parsing with the abbreviated year pattern ("y" or "yy"), SimpleDateFormat must interpret the abbreviated year relative to some century. It does this by adjusting dates to be within 80 years before and 20 years after the time the SimpleDateFormat instance is created. For example, using a pattern of "MM/dd/yy" and a SimpleDateFormat instance created on Jan 1, 1997, the string "01/11/12" would be interpreted as Jan 11, 2012 while the string "05/04/64" would be interpreted as May 4, 1964. During parsing, only strings consisting of exactly two digits, as defined by Character.isDigit(char), will be parsed into the default century. Any other numeric string, such as a one digit string, a three or more digit string, or a two digit string that isn't all digits (for example, "-1"), is interpreted literally. So "01/02/3" or "01/02/003" are parsed, using the same pattern, as Jan 2, 3 AD. Likewise, "01/02/-3" is parsed as Jan 2, 4 BC.

http://java.sun.com/j2se/1.4.2/docs/api/java/text/SimpleDateFormat.html
|
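Stef's explicit variant, combined with the "exactly two digits" rule from the SimpleDateFormat text, might look roughly like this. It is a Python sketch only: `expand_year` and its `base_for_year` parameter are hypothetical names modelled on the proposed `usingBaseForYear:` keyword, which does not exist in Pharo as far as this thread shows.

```python
def expand_year(digits: str, base_for_year: int) -> int:
    """Expand a year field using a base supplied by the caller, e.g. 2000.

    Only a field of exactly two decimal digits counts as abbreviated.
    Anything else ("3", "003", "2030", "-1") is taken literally.
    """
    if len(digits) == 2 and digits.isascii() and digits.isdigit():
        return base_for_year + int(digits)
    return int(digits)
```

With this shape the century is never guessed: `expand_year("30", 2000)` and `expand_year("30", 1900)` give different answers because the caller said so, and a one-digit field such as `"3"` stays the year 3.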
In reply to this post by David Hotham
David,
There is a simpler solution: parse 6-Jan-10 as exactly what it *says* - 2000 years ago. Then let the programmer who got stuck with the garbage data figure it out. At a minimum, one should have to provide a pivot date for the conversion to happen.

In my world, the most common examples of two-digit years come from a computer system built and operated during the 1990s. Sad, huh. It was a pearl of user interface design, and otherwise a complete technical disaster that worked more by accident than design, but the data contained in it is priceless. An appropriate pivot date would be the date of the document, not "today." I suspect that the problem of separate context will be widespread.

Bill

-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of David Hotham
Sent: Sunday, January 17, 2010 9:18 AM
To: [hidden email]
Subject: Re: [Pharo-project] Date fromString: '6-Jan-10'
|
In reply to this post by Stéphane Ducasse
Stef,
I don't care too much, and I have described how I will handle it if it arises again in my work. To do what you are describing, there needs to be a second argument specifying the pivot year for the cleanup/guess. A useful variant would be to specify instead the date on which the data was recorded, with an algorithm to arrive at a likely pivot date.

Bill

-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Stéphane Ducasse
Sent: Sunday, January 17, 2010 9:36 AM
To: [hidden email]
Subject: Re: [Pharo-project] Date fromString: '6-Jan-10'
|
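Bill's two suggestions, a pivot year passed as a second argument and a pivot derived from the date the data was recorded, can be sketched as below. This is Python, both function names are hypothetical, and the rule in `pivot_for_record` is only one possible "algorithm to arrive at a likely pivot date", assuming recorded data rarely refers to years after it was written.

```python
def year_from_pivot(yy: int, pivot_year: int) -> int:
    """Return the unique year in [pivot_year, pivot_year + 99] ending in yy."""
    if not 0 <= yy <= 99:
        raise ValueError("expected a two-digit year in the range 0..99")
    century = pivot_year - pivot_year % 100
    year = century + yy
    # If that falls before the pivot, the intended year is a century later.
    return year if year >= pivot_year else year + 100


def pivot_for_record(recorded_year: int, years_ahead: int = 0) -> int:
    """Assumed rule: allow at most years_ahead years after the record date."""
    return recorded_year + years_ahead - 99
```

For data recorded in 1995, `pivot_for_record(1995)` gives a window of 1896 to 1995, so a stored `07` resolves to 1907 regardless of when the conversion is run. A pivot of 1969 reproduces the fixed strptime window mentioned earlier in the thread.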
In reply to this post by David Hotham
El dom, 17-01-2010 a las 14:17 +0000, David Hotham escribió:
> This seems to have been more controversial than I expected. I'm going to
> write this post in which I will:
>
> - try to make the case for supporting 2-digit years in some sensible way
> - answer some of the counter-arguments and questions that people have
>   raised
>
> ... at which point I intend to withdraw, and the community will no doubt
> reach some sensible conclusion.
>
> David
>
> The case for accepting 2-digit years is, I think, very simple: it's useful.
> Like it or not, data sometimes comes that way and this has to be dealt with.

Yes, but it can be handled in your own code: first massage the legacy input data and then, once the year has four digits, pass it to the Smalltalk code. That way the core library doesn't get dirtied with special cases and your app still works correctly.

So first correct your input and then process it, just as a web app would. Don't assume the input data is OK.

Cheers

> Granted, this means that we have to introduce some heuristic to guess which
> century is intended. However, I don't see that this is so bad. Amongst the
> possible approaches are:
>
> - Following the Posix standard for strptime, the date is assumed to be
>   between 1969 and 2068 (see
>   http://www.opengroup.org/onlinepubs/009695399/functions/strptime.html).
>   Python goes this way, for example.
>
> - the date is assumed to be between 80 years in the past and 20 years in
>   the future (eg Java's SimpleDateFormat, see
>   http://java.sun.com/javase/7/docs/api/java/text/SimpleDateFormat.html)
>
> - allow the 100 year-period in which 2-digit years will be placed to be
>   specified (eg the Java SimpleDateFormat also allows this)
>
> Now I will address some concerns that people have raised.
>
> The first is essentially aesthetic: this kind of cleverness should not be
> there, and everyone should always use four-digit years. I've some sympathy
> with this, and it would be wonderful if everyone was as principled as we
> are. But, alas, it is not so! My judgment is that it's worth a little
> ugliness to be able to deal with the common case of two-digit years.
>
> A second objection was that the code was right all along: '6-Jan-10' should
> correctly be parsed as 6th January 1910. This seems to me peculiar. I
> understand a position that says that two-digit years should not be accepted
> at all, but arguing that they should be accepted and should be interpreted
> to be 100 or more years ago... well, this seems unlikely to be the most
> useful approach.
>
> A third objection was that this is the thin end of the wedge: if we start
> accepting two-digit years, who knows where the madness will end? We will
> have to deal with all kinds of crazy situation!
>
> I think that this objection is just wrong. No one is arguing for three-digit
> dates, or hexadecimal dates, or any other crazy stuff. Two-digits are
> commonplace. Other crazy things are not commonplace. Let's draw the line
> where it makes sense to draw the line: to my mind handling two-digit years
> clearly falls on the 'useful' side and not the 'crazy' side.
>
> Finally, I must note the irony of one poster declaring that we should only
> accept well-formed strings, and pointing us at RFC822 for reference. Years
> in RFC822 are defined to have two digits. The RFC does not say what century
> they should fall in. (This has been obsoleted by RFC2822, which uses
> four-digit years).

--
Miguel Cobá
http://miguel.leugim.com.mx
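The "fix the input first" approach suggested in this reply could look roughly like the sketch below. It is only an illustration under stated assumptions: the block name expandYear and the pivot value of 20 are hypothetical and not part of Pharo, and the application is assumed to know that its legacy data never predates 1920. Only Date class>>year:month:day: comes from the method quoted in this thread.

```smalltalk
"Application-side sketch: expand a two-digit year before building the Date.
 expandYear and the pivot are assumptions for illustration, not Pharo API."
| expandYear date |
expandYear := [:yy :pivot |
	yy >= 100
		ifTrue: [yy]	"already a full year, leave it untouched"
		ifFalse: [yy < pivot
			ifTrue: [2000 + yy]	"below the pivot: this century"
			ifFalse: [1900 + yy]]].	"at or above the pivot: last century"

"Legacy record '5-APR-10', already split into day 5, month 4, year 10."
date := Date
	year: (expandYear value: 10 value: 20)
	month: 4
	day: 5.
"date now represents 5 April 2010; with year 82 it would be 1982."
```

Keeping the pivot as an explicit argument means the century guess lives in the application, where the shape of the data is known, and the core parser only ever sees four-digit years.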
In reply to this post by David Hotham
On 17/01/2010 12:17, David Hotham <[hidden email]> wrote:
> This seems to have been more controversial than I expected. I'm
> going to write this post in which I will:
>
> - try to make the case for supporting 2-digit years in some
>   sensible way
> - answer some of the counter-arguments and questions that people
>   have raised
>
> ... at which point I intend to withdraw, and the community will no
> doubt reach some sensible conclusion.

David,

Thanks for spending some of your time articulating a subject that at first may seem simple but hides some complexity, because it has to bring the natural language of users closer to the deterministic rigor of computers.

> The case for accepting 2-digit years is, I think, very simple: it's
> useful. Like it or not, data sometimes comes that way and this has
> to be dealt with.

I agree with the bulk of the statement about usefulness; the only problem I see is in the details. "Data sometimes comes that way" has to be qualified: are we talking about batch processing, where all the dates are strings and no human intervenes during the import, or about an operator entering the data through an interface?

In the first scenario, IMO the right thing is to have a specific conversion routine from the string to a Date object. For the latter, since we're talking about Smalltalk, we could always pop up a window asking for the correct interpretation, or have appropriate flags that determine how to interpret these strings as dates (IIRC Excel has/had such a setting).

Since these details will also be very application specific, I think they belong in application code and not in the Core of Pharo.

> Granted, this means that we have to introduce some heuristic to
> guess which century is intended. However, I don't see that this is
> so bad.

Yes. It would not be so bad as long as we refrain from trying to encompass (i.e. put in Core) every approach that other platforms or applications have arrived at.

> Amongst the possible approaches are:
> - Following the Posix standard for strptime, the date is assumed to
>   be between 1969 and 2068 (see
>   http://www.opengroup.org/onlinepubs/009695399/functions/strptime.html).

But the danger lies in this:

<quote>
Note: It is expected that in a future version of IEEE Std 1003.1-2001 the default century inferred from a 2-digit year will change. (This would apply to all commands accepting a 2-digit year as input.)
</quote>

>   Python goes this way, for example.
> - the date is assumed to be between 80 years in the past and 20
>   years in the future (eg Java's SimpleDateFormat, see
>   http://java.sun.com/javase/7/docs/api/java/text/SimpleDateFormat.html)
> - allow the 100 year-period in which 2-digit years will be placed
>   to be specified (eg the Java SimpleDateFormat also allows this)

This is an interesting advantage (at first): it uses a moving epoch to compute the century. But it has, IMNHO, a terrible disadvantage: it will break unit tests as soon as your code crosses the boundary of the ten-year period (like some code written last year and tested in 2011).

> Now I will address some concerns that people have raised.
> The first is essentially aesthetic: this kind of cleverness should
> not be there, and everyone should always use four-digit years. I've
> some sympathy with this, and it would be wonderful if everyone was
> as principled as we are. But, alas, it is not so! My judgment is
> that it's worth a little ugliness to be able to deal with the common
> case of two-digit years.

As a complement to "accepting 2-digit years...", I would say (quoting Stephen Leake liberally) that we should have a programming environment that helps us write better programs. So with a little extra code we can introduce the needed discipline.

> A second objection was that the code was right all along:
> '6-Jan-10' should correctly be parsed as 6th January 1910. This
> seems to me peculiar. I understand a position that says that
> two-digit years should not be accepted at all, but arguing that they
> should be accepted and should be interpreted to be 100 or more years
> ago... well, this seems unlikely to be the most useful approach.

This is sensible because it is a common epoch date for computers, and over the last century we have accumulated lots of computer records in which these two-digit dates meant 1910.

> A third objection was that this is the thin end of the wedge: if we
> start accepting two-digit years, who knows where the madness will
> end? We will have to deal with all kinds of crazy situation!

Yes, if we come to think about it, it is really serious! Sooner or later one of us will need to accept the date February 29, 1900 "because of Lotus 1-2-3 compatibility", or perhaps a special case allowing January 0, 1900 "due to Microsoft Excel compatibility"!!

> I think that this objection is just wrong. No one is arguing for
> three-digit dates, or hexadecimal dates, or any other crazy stuff.

In general people do not argue; they introduce these quirks into systems together with some nice unit tests, some interesting explanation, etc. The point is to keep these things in application code, not in the Core of Pharo.

> Two-digits are commonplace. Other crazy things are not commonplace.
> Let's draw the line where it makes sense to draw the line: to my
> mind handling two-digit years clearly falls on the 'useful' side and
> not the 'crazy' side.
> Finally, I must note the irony of one poster declaring that we
> should only accept well-formed strings, and pointing us at RFC822
> for reference. Years in RFC822 are defined to have two digits. The
> RFC does not say what century they should fall in. (This has been
> obsoleted by RFC2822, which uses four-digit years).

Yes, I was thinking of 2822, but my memory is still too imprinted with 822 (which the POSIX standard points to when referring to time zones)...!

--
Cesar Rabak
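For comparison with the fixed `year < 20` cutoff in the patch, the POSIX strptime rule discussed above (two-digit values 69 to 99 map to 1969-1999, values 00 to 68 map to 2000-2068) can be sketched as follows. The block name posixYear is hypothetical; nothing with this name is claimed to exist in Pharo, and the sketch simply passes through any value that is not a two-digit year.

```smalltalk
"Sketch of the POSIX %y century rule as a standalone block.
 posixYear is an illustrative name, not part of Pharo."
| posixYear |
posixYear := [:yy |
	(yy between: 0 and: 99)
		ifFalse: [yy]	"not a two-digit year: leave as given"
		ifTrue: [yy >= 69
			ifTrue: [1900 + yy]
			ifFalse: [2000 + yy]]].

posixYear value: 69.	"1969"
posixYear value: 68.	"2068"
posixYear value: 10.	"2010"
posixYear value: 1982.	"1982"
```

Because the cutoff is a constant, results do not change with the date on which the code runs, which is the property the unit-test concern above is about; the trade-off is that the constant itself may need revisiting, as the quoted POSIX note warns.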
Quoting [hidden email]:
> ...
> > Python goes this way, for example.
>
> > - the date is assumed to be between 80 years in the past and 20
> >   years in the future (eg Java's SimpleDateFormat, see
> >   http://java.sun.com/javase/7/docs/api/java/text/SimpleDateFormat.html)
> > - allow the 100 year-period in which 2-digit years will be placed
> >   to be specified (eg the Java SimpleDateFormat also allows this)
>
> This an interesting advantage (at first) that uses some moving epoch
> to compute the century, but has IMNHO a terrific disadvantage: it will
> break unit tests as soon your code crosses the boundary of ten years
> period (like some code done last year and tested in 2011).

I think this is a *sliding* period: 80 years before the *current* year, and 20 years into the future from the *current* year. So, presently this would be from 1930 to 2030; next year it would be 1931-2031.

I would suspect this would cover most uses, and it is easy to specify. (Birth dates might be a common exception.)

I agree that using 4-digit years is best for user input.

David
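The sliding window described here can be sketched as a block that takes the reference year as an argument. This is an illustration only: slidingYear is a hypothetical name, and the window is taken to be exactly 100 years starting 80 years before the reference year, so each two-digit value maps to one year. Passing the reference year explicitly is one way to keep unit tests stable as the calendar advances.

```smalltalk
"Sketch of an 80-years-back sliding window for two-digit years.
 slidingYear is an illustrative name, not part of Pharo."
| slidingYear |
slidingYear := [:yy :referenceYear | | lower candidate |
	lower := referenceYear - 80.	"earliest year in the window"
	candidate := (lower // 100) * 100 + yy.	"same century as the lower bound"
	candidate < lower
		ifTrue: [candidate + 100]	"fell before the window: next century"
		ifFalse: [candidate]].

"With a fixed reference year the results are deterministic:"
slidingYear value: 10 value: 2010.	"2010"
slidingYear value: 30 value: 2010.	"1930"
slidingYear value: 29 value: 2010.	"2029"
slidingYear value: 82 value: 2010.	"1982"

"In production the reference year would come from the clock:"
slidingYear value: 10 value: Date today year.
```

A test that pins the reference year, as in the first four calls, gives the same answers whenever it runs; only the last call depends on today's date.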