I have a large file with dates in the format '2001-11-04'. As I couldn't find a method to change to date, I'm doing:
    date := self dateFrom: (dateString subStrings: '-').

    dateFrom: anArray
        "anArray should be like #('2005' '09' '06')"
        ^Date
            year: anArray first asInteger
            month: anArray second asInteger
            day: anArray third asInteger

This works, but profiling shows half the work in loading the file is in dateFrom:

Is there a more efficient way? Is there a standard method I've missed?

Also, I tried to import the .cs of a date parser by Goran here:
http://www.nabble.com/Parsing-dates%21-td10078517.html#a10078517

When I unpack the file, then click 'install' in the file list, I get a syntax error. How should I be loading this file?

Thanks,
...Stan
Efficient date parsing
> I have a large file with dates in the format '2001-11-04'. As
> I couldn't find a method to change to date, I'm doing:
>
>     date := self dateFrom: (dateString subStrings: '-').
>
>     dateFrom: anArray
>         "anArray should be like #('2005' '09' '06')"
>         ^Date year: anArray first asInteger month: anArray
>             second asInteger day: anArray third asInteger
>
> This works, but profiling shows half the work in loading the
> file is in dateFrom:
>
> Is there a more efficient way? Is there a standard method I've missed?

On my Date class, class side, I use the following extension methods for parsing dates.

    fromString: aString format: aFormat
        aFormat = #dmy ifTrue: [^ self readEuro: aString readStream].
        aFormat = #iso8601 ifTrue: [^ self readISO: aString readStream].
        ^ self fromString: aString

    readEuro: aStream
        "Read a Date in euro format dd-mm-yyyy"
        | day month year |
        aStream skipSeparators.
        day := Integer readFrom: aStream.
        [aStream peek isDigit] whileFalse: [aStream skip: 1].
        month := Integer readFrom: aStream.
        [aStream peek isDigit] whileFalse: [aStream skip: 1].
        year := Integer readFrom: aStream.
        ^ self newDay: day month: month year: year

    readISO: aStream
        "Read a Date in ISO-8601 format yyyy-mm-dd"
        | day month year |
        aStream skipSeparators.
        year := Integer readFrom: aStream.
        [aStream peek isDigit] whileFalse: [aStream skip: 1].
        month := Integer readFrom: aStream.
        [aStream peek isDigit] whileFalse: [aStream skip: 1].
        day := Integer readFrom: aStream.
        ^ self newDay: day month: month year: year

Ramon Leon
http://onsmalltalk.com

_______________________________________________
Beginners mailing list
[hidden email]
http://lists.squeakfoundation.org/mailman/listinfo/beginners
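For the format in the original question, a call to Ramon's extension might look like the workspace sketch below. It assumes the three methods from his post (fromString:format:, readEuro: and readISO:) have been added to the class side of Date; they are not part of a stock image.

```smalltalk
"Workspace sketch. Assumes Ramon's class-side extension methods on Date
 (fromString:format:, readEuro:, readISO:) are already installed."
| isoDate euroDate |
isoDate := Date fromString: '2001-11-04' format: #iso8601.
euroDate := Date fromString: '04-11-2001' format: #dmy.

"Both strings describe 4 November 2001, so the two Dates should compare equal."
isoDate = euroDate
```

Any other format symbol falls through to the plain fromString: parser, so existing callers are unaffected.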
In reply to this post by Stan Shepherd
>>>>> "stan" == stan shepherd <[hidden email]> writes:
stan> Is there a more efficient way? Is there a standard method I've missed?

One thing you should ensure is that you're not converting the same date twice. Typical logs have thousands of entries all with the same date, and I've seen far too many naive solutions that keep converting, over and over again, '2008-02-15' into Feb 15 2008. Really no point.

Implement a simple cache:

    ^dateCache at: dateString ifAbsent:
        [dateCache at: dateString put: (Date from: dateString)].

something like that.

--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<[hidden email]> <URL:http://www.stonehenge.com/merlyn/>
Perl/Unix/security consulting, Technical writing, Comedy, etc. etc.
See PerlTraining.Stonehenge.com for onsite and open-enrollment Perl training!
> Implement a simple cache:
>
>     ^dateCache at: dateString ifAbsent:
>         [dateCache at: dateString put: (Date from: dateString)].
>
> something like that.
>
> Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095 <[hidden email]>

Or more idiomatic...

    ^dateCache at: dateString ifAbsentPut: [Date from: dateString].

Ramon Leon
http://onsmalltalk.com
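To see the effect of the cache, here is a small workspace sketch. The sample strings and the conversions counter are made up for illustration, and the conversion inside the block is Stan's original subStrings: approach, not a particular library parser.

```smalltalk
"Workspace sketch: count how many real conversions the cache lets through.
 The input data and the 'conversions' counter are illustrative assumptions."
| dateCache conversions lines dates parts |
dateCache := Dictionary new.
conversions := 0.
lines := #('2008-02-15' '2008-02-15' '2008-02-16' '2008-02-15').
dates := lines collect: [:each |
    dateCache at: each ifAbsentPut: [
        "This block runs only the first time a given string is seen."
        conversions := conversions + 1.
        parts := each subStrings: '-'.
        Date
            year: parts first asInteger
            month: parts second asInteger
            day: parts third asInteger]].
conversions
```

With four input lines but only two distinct strings, the conversion block runs twice; the other two lookups are answered straight from the Dictionary.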
>>>>> "Ramon" == Ramon Leon <[hidden email]> writes:
Ramon> Or more idiomatic...
Ramon> ^dateCache at: dateString ifAbsentPut: [Date from: dateString].

Yeah, shortly after I posted that, I remembered that. :)

Also, for the beginners: you need to initialize dateCache to a Dictionary. The normal way to do that is to have an instance-side method called #initialize:

    initialize
        super initialize. "NEVER NEVER leave this out"
        dateCache := Dictionary new.

When you save this, Squeak will ask "I dunno dateCache", and you should make it an instance variable.

--
Randal L. Schwartz - Stonehenge Consulting Services, Inc.
> Yeah, shortly after I posted that, I remembered that. :)
>
> Also, for the beginners. you need to initialize dateCache to
> a Dictionary.
> the normal way to do that is to have an instance side method called
> #initialize:
>
>     initialize
>         super initialize. "NEVER NEVER leave this out"
>         dateCache := Dictionary new.
>
> When you save this, squeak will ask "I dunno dateCache", and
> you should make it an instance variable.
>
> --
> Randal L. Schwartz - Stonehenge Consulting Services, Inc. -

Unless you're doing a class side #initialize, in which case you don't want to call super initialize.

Ramon Leon
http://onsmalltalk.com
On 03.04.2008, at 07:49, Ramon Leon wrote:
>> Yeah, shortly after I posted that, I remembered that. :)
>>
>> Also, for the beginners. you need to initialize dateCache to
>> a Dictionary.
>> the normal way to do that is to have an instance side method called
>> #initialize:
>>
>>     initialize
>>         super initialize. "NEVER NEVER leave this out"
>>         dateCache := Dictionary new.
>>
>> When you save this, squeak will ask "I dunno dateCache", and
>> you should make it an instance variable.
>>
>> --
>> Randal L. Schwartz - Stonehenge Consulting Services, Inc. -
>
> Unless you're doing a class side #initialize, in which case you
> don't want to call super initialize.

Indeed, that's one reason why I prefer lazy initialization over #initialize:

    dateFrom: aString
        dateCache ifNil: [dateCache := Dictionary new].
        ^dateCache at: aString ifAbsentPut: [Date from: aString].

(although the canonical way is to have the dateCache init code in a #dateCache accessor, then always use "self dateCache").

One major plus of lazy initialization is that this supports code upgrades of a running system, where #initialize is usually not run again.

Also, if this indeed is for log data with many identical dates in a row you might flush the cache from time to time, or indeed even only check if the next date is the same as the previous ... knowledge of your domain beats any general optimization ;)

- Bert -
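Bert's last suggestion, checking only whether the next date string equals the previous one, could be sketched in a workspace like this. The variable names (lastString, lastDate, conversions) and the sample lines are assumptions for illustration; the conversion itself is Stan's original subStrings: approach.

```smalltalk
"Workspace sketch: reuse the previous Date while consecutive lines repeat
 the same date string. All names and data here are illustrative."
| lines lastString lastDate dates conversions parts |
lines := #('2008-02-15' '2008-02-15' '2008-02-16' '2008-02-16' '2008-02-16').
conversions := 0.
dates := lines collect: [:each |
    "lastString starts out as nil, so the first line always converts."
    lastString = each ifFalse: [
        parts := each subStrings: '-'.
        lastDate := Date
            year: parts first asInteger
            month: parts second asInteger
            day: parts third asInteger.
        lastString := each.
        conversions := conversions + 1].
    lastDate].
conversions
```

This keeps no Dictionary at all, so it only pays off when equal dates really do arrive in runs, as in a chronologically ordered log. For interleaved dates the Dictionary cache is the safer choice.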
In reply to this post by Randal L. Schwartz
Thanks Randal and Ramon. That's about three times quicker; most of the improvement seems to be in the caching.
...Stan
On 03.04.2008, at 09:20, stan shepherd wrote:
> Thanks Randal and Ramon. That's about three times quicker; most of the
> improvement seems to be in the caching.
> ...Stan

Also, do profile your code.

- Bert -
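One way to do that from a workspace is MessageTally, as in the sketch below. The sample size and date string are made up, and the profiled block is Stan's original conversion; swap in your own loading code.

```smalltalk
"Workspace sketch: profile the uncached conversion over repeated input.
 The 20000 sample strings are an illustrative assumption."
| strings parts |
strings := (1 to: 20000) collect: [:i | '2008-02-15'].
MessageTally spyOn: [
    strings do: [:each |
        parts := each subStrings: '-'.
        Date
            year: parts first asInteger
            month: parts second asInteger
            day: parts third asInteger]].
```

MessageTally opens a report showing where the time went, which makes it easy to compare the run before and after adding a cache.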
Hi!
I tried downloading my original attachment and the problem is that it was/is double-gzipped - not sure why.

Anyway, attaching once more the ChangeSet. It installs fine in 3.10beta7159 at least. You install it by decompressing it and then use the "install" button in the filelist, or by just using "filein" on the compressed file directly (not sure why there is no install button in that case but anyway).

I also include the method comment below to show you what it gives you:

    readFrom: inputStream pattern: pattern
        "Read a Date from the stream based on the pattern which can include the tokens:

            y = A year with 1-n digits
            yy = A year with 2 digits
            yyyy = A year with 4 digits
            m = A month with 1-n digits
            mm = A month with 2 digits
            d = A day with 1-n digits
            dd = A day with 2 digits

        ...and any other Strings inbetween. Representing $y, $m and $d is done using
        \y, \m and \d and slash itself with \\. Simple example patterns:

            'yyyy-mm-dd'
            'yyyymmdd'
            'yy.mm.dd'
            'ymd'

        A year given using only two decimals is considered to be >2000."

regards, Göran

Attachment: DateReadFromPattern.cs.gz (1K)
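Going by the method comment, parsing the format from the original question would look roughly like this. The sketch assumes the ChangeSet is installed and that it adds readFrom:pattern: to the class side of Date; the post shows the selector and comment but not which class it lives on.

```smalltalk
"Workspace sketch. Assumes Göran's ChangeSet is installed and that
 readFrom:pattern: is a class-side method of Date (an assumption)."
| date |
date := Date readFrom: '2001-11-04' readStream pattern: 'yyyy-mm-dd'.
date
```

The method takes a stream, not a String, hence the readStream on the date text.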
I've seen the same thing with a few snippets I've downloaded. Thanks for that.
...Stan