On Mon, May 26, 2014 at 02:17:47PM +0000, J. Vuletich (mail lists) wrote:
> Hi David, Folks, > > Quoting "David T. Lewis" <[hidden email]>: > > >I have been working on a variation of class DateAndTime that replaces its > >instance variables (seconds offset jdn nanos) with two instance variables, > >utcMicroseconds to represent microseconds elapsed since the Posix epoch, > >and > >localOffsetSeconds to represent the local time zone offset. When > >instantiating > >the time now, A single call primitiveUtcWithOffset is used to obtain these > >two values atomically as reported by the underlying platform. > > > >There are several advantages to this representation of DateAndTime, the > >most > >important of which is that its magnitude is unambiguous regardless > >of daylight > >savings transitions in local time zones. > > > >This is my attempt to address some historical baggage in Squeak. The VM > >reports time related to the local time zone, and the image attempts to > >convert to UTC (sometimes incorrectly). A UTC based representation makes > >the > >implementation of time zone tables more straightforward (see for example > >the Olson time zone tables in TimeZoneDatabase on SqueakMap). > >... > >Dave > > I very much support this approach. I did a bit of testing of > <primitive: 'primitiveUtcWithOffset'> . I found that on a Mac, with > 'Croquet Closure Cog VM [CoInterpreter VMMaker.oscog-eem.331] Squeak > Cog 4.0.2776' 'Mac OS' 'intel' '1092' (from Eliot's site), the second > element I get (time zone offset) is -140473411. > > The correct value would be -10800, as answered in Windows. I could not > test on Linux yet (could not get the vm to run in Ubuntu 14.04 64 bit > :( ). > > Any clue on what's wrong on Mac OS? > > BTW, which would be the current non-Cog VMs to try? > Oops, I mistakenly said that the Cog VMs could be used. But it looks like there is a regression or code merge problem of some sort. I'm afraid that I was testing with my own locally compiled Cog VM and did not notice the problem. 
A unix Mac VM from squeakvm.org/unix should demonstrate the correct behavior. CC to vm-dev list: Eliot, the fix for this was here (but it seems to have been overridden by a more recent change): Name: VMMaker.oscog-dtl.286 Author: dtl Time: 4 May 2013, 11:29:25.237 am UUID: 8be237d9-7812-4792-9723-90f9cff0c2e9 Ancestors: VMMaker.oscog-eem.285 Replace broken primitiveUtcWithOffset with a version that works. Dave |
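The two offsets reported above can be checked with simple arithmetic: -10800 seconds is exactly three hours west of UTC, while -140473411 seconds is more than four years and cannot be a real time zone offset. The following is a minimal Python sketch of that sanity check, not anything taken from the VM sources. The helper names and the 14 hour bound (the widest civil time zone offset in use) are assumptions made for illustration.

```python
# Hypothetical sanity check for the second element answered by
# primitiveUtcWithOffset (the local offset in seconds).
MAX_OFFSET_SECONDS = 14 * 60 * 60  # assumed bound: no civil zone exceeds UTC+14


def is_plausible_offset(offset_seconds):
    """Answer True if the value could be a real time zone offset."""
    return abs(offset_seconds) <= MAX_OFFSET_SECONDS


def offset_as_hours(offset_seconds):
    """Convert an offset in seconds to (possibly fractional) hours."""
    return offset_seconds / 3600


for reported in (-10800, -140473411):
    print(reported, offset_as_hours(reported), is_plausible_offset(reported))
```

A check like this only flags garbage values; it does not tell whether a plausible offset is the correct one for the machine's zone.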
In reply to this post by Louis LaBrunda
On Mon, May 26, 2014 at 10:48:16AM -0400, Louis LaBrunda wrote:
> > On Sun, 25 May 2014 13:48:44 -0400, "David T. Lewis" <[hidden email]> > wrote: > > >I have been working on a variation of class DateAndTime that replaces its > >instance variables (seconds offset jdn nanos) with two instance variables, > >utcMicroseconds to represent microseconds elapsed since the Posix epoch, and > >localOffsetSeconds to represent the local time zone offset. When instantiating > >the time now, A single call primitiveUtcWithOffset is used to obtain these > >two values atomically as reported by the underlying platform. > > > >There are several advantages to this representation of DateAndTime, the most > >important of which is that its magnitude is unambiguous regardless of daylight > >savings transitions in local time zones. > > > Hi Dave, > > May I respectfully ask why localOffsetSeconds (to represent the local time > zone offset) is needed? It seems to me a UTC time is enough. Is there > really a need for the timezone offset the instance was created in? Does > every DateAndTime instance need to carry this offset around with it? I > would think the offset is only needed if one wants to display a date/time > as a local value and then one could get the local offset from the VM or a > program setting the user had previously supplied regardless of where the > computer was setup to run. I guess there might be some historic interest > as to the timezone an instances (or many instances) was created in but one > could just keep that as a separate value. Hi Lou, Good question. In fact, one of the reasons I like the UTC implementation is that it helps clarify the two main responsibilities of DateAndTime. One is to represent time as a magnitude (for duration calculation, etc). The other is to display time in the frame of reference of a local time zone. It is not at all clear to me that those two responsibilities belong in the same class. Dave |
On 26.05.2014, at 17:16, David T. Lewis <[hidden email]> wrote: > On Mon, May 26, 2014 at 10:48:16AM -0400, Louis LaBrunda wrote: >> >> On Sun, 25 May 2014 13:48:44 -0400, "David T. Lewis" <[hidden email]> >> wrote: >> >>> I have been working on a variation of class DateAndTime that replaces its >>> instance variables (seconds offset jdn nanos) with two instance variables, >>> utcMicroseconds to represent microseconds elapsed since the Posix epoch, and >>> localOffsetSeconds to represent the local time zone offset. When instantiating >>> the time now, A single call primitiveUtcWithOffset is used to obtain these >>> two values atomically as reported by the underlying platform. >>> >>> There are several advantages to this representation of DateAndTime, the most >>> important of which is that its magnitude is unambiguous regardless of daylight >>> savings transitions in local time zones. >>> >> Hi Dave, >> >> May I respectfully ask why localOffsetSeconds (to represent the local time >> zone offset) is needed? It seems to me a UTC time is enough. Is there >> really a need for the timezone offset the instance was created in? Does >> every DateAndTime instance need to carry this offset around with it? I >> would think the offset is only needed if one wants to display a date/time >> as a local value and then one could get the local offset from the VM or a >> program setting the user had previously supplied regardless of where the >> computer was setup to run. I guess there might be some historic interest >> as to the timezone an instances (or many instances) was created in but one >> could just keep that as a separate value. > > Hi Lou, > > Good question. In fact, one of the reasons I like the UTC implementation is > that it helps clarify the two main responsibilities of DateAndTime. One is > to represent time as a magnitude (for duration calculation, etc). The other > is to display time in the frame of reference of a local time zone. 
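It is not > at all clear to me that those two responsibilities belong in the same class. > > Dave We need to be able to distinguish between local and universal time. It would be rather inconvenient if asking a DateAndTime for e.g. the hour would not be made to answer the local hour. Arguably the local time offset could be moved to a subclass, but having a single DateAndTime class is appealing for simplicity reasons, too. - Bert - |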
On Mon, 26 May 2014 17:29:19 +0200, Bert Freudenberg <[hidden email]>
wrote: > >On 26.05.2014, at 17:16, David T. Lewis <[hidden email]> wrote: > >> On Mon, May 26, 2014 at 10:48:16AM -0400, Louis LaBrunda wrote: >>> >>> On Sun, 25 May 2014 13:48:44 -0400, "David T. Lewis" <[hidden email]> >>> wrote: >>> >>>> I have been working on a variation of class DateAndTime that replaces its >>>> instance variables (seconds offset jdn nanos) with two instance variables, >>>> utcMicroseconds to represent microseconds elapsed since the Posix epoch, and >>>> localOffsetSeconds to represent the local time zone offset. When instantiating >>>> the time now, A single call primitiveUtcWithOffset is used to obtain these >>>> two values atomically as reported by the underlying platform. >>>> >>>> There are several advantages to this representation of DateAndTime, the most >>>> important of which is that its magnitude is unambiguous regardless of daylight >>>> savings transitions in local time zones. >>>> >>> Hi Dave, >>> >>> May I respectfully ask why localOffsetSeconds (to represent the local time >>> zone offset) is needed? It seems to me a UTC time is enough. Is there >>> really a need for the timezone offset the instance was created in? Does >>> every DateAndTime instance need to carry this offset around with it? I >>> would think the offset is only needed if one wants to display a date/time >>> as a local value and then one could get the local offset from the VM or a >>> program setting the user had previously supplied regardless of where the >>> computer was setup to run. I guess there might be some historic interest >>> as to the timezone an instances (or many instances) was created in but one >>> could just keep that as a separate value. >> >> Hi Lou, >> >> Good question. In fact, one of the reasons I like the UTC implementation is >> that it helps clarify the two main responsibilities of DateAndTime. One is >> to represent time as a magnitude (for duration calculation, etc). 
The other >> is to display time in the frame of reference of a local time zone. It is not >> at all clear to me that those two responsibilities belong in the same class. >> >> Dave > >We need to be able to distinguish between local and universal time. It would be rather inconvenient if asking a DateAndTime for e.g. the hour would not be made to answer the local hour. Arguably the local time offset could be moved to a subclass, but having a single DateAndTime class is appealing for simplicity reasons, too. >- Bert - I guess a DateAndTime class (without offset) could always answer 0 for the offset and a DateAndTimeWithOffset subclass could carry and answer the offset. Methods could be provided to morph one into the other if desired. Lou ----------------------------------------------------------- Louis LaBrunda Keystone Software Corp. SkypeMe callto://PhotonDemon mailto:[hidden email] http://www.Keystone-Software.com |
In reply to this post by David T. Lewis
Quoting "David T. Lewis" <[hidden email]>: > On Mon, May 26, 2014 at 02:17:47PM +0000, J. Vuletich (mail lists) wrote: >> Hi David, Folks, >> ... >> The correct value would be -10800, as answered in Windows. I could not >> test on Linux yet (could not get the vm to run in Ubuntu 14.04 64 bit >> :( ). >> >> Any clue on what's wrong on Mac OS? >> >> BTW, which would be the current non-Cog VMs to try? >> > > Oops, I mistakenly said that the Cog VMs could be used. But it looks like > there is a regression or code merge problem of some sort. I'm afraid that > I was testing with my own locally compiled Cog VM and did not notice the > problem. > > A unix Mac VM from squeakvm.org/unix should demonstrate the correct behavior. > > CC to vm-dev list: > > Eliot, the fix for this was here (but it seems to have been overridden by > a more recent change): > > Name: VMMaker.oscog-dtl.286 > Author: dtl > Time: 4 May 2013, 11:29:25.237 am > UUID: 8be237d9-7812-4792-9723-90f9cff0c2e9 > Ancestors: VMMaker.oscog-eem.285 > > Replace broken primitiveUtcWithOffset with a version that works. > > Dave Thanks Dave, I could run on 64-bit Ubuntu with the VM from squeakvm.org/unix. I'll try the Mac VM when I get the chance to borrow a Mac again. One thing that seems to be missing is a Windows interpreter with the new primitives, although I don't know if there is a real need for that. Additionally, besides getting rid of <primitive: 137> (primLocalSecondsClock, which will overflow in 2037), it would be great to stop using <primitive: 135> (primMillisecondClock), which overflows every six days. But for this, we would need a new <primitive: 136> (primSignal:atMilliseconds:), as it uses the same time base. This would enable a serious simplification of Delay. Cheers, Juan Vuletich |
On Mon, May 26, 2014 at 04:41:10PM +0000, J. Vuletich (mail lists) wrote:
> > Quoting "David T. Lewis" <[hidden email]>: > > >On Mon, May 26, 2014 at 02:17:47PM +0000, J. Vuletich (mail lists) wrote: > >>Hi David, Folks, > >>... > >>The correct value would be -10800, as answered in Windows. I could not > >>test on Linux yet (could not get the vm to run in Ubuntu 14.04 64 bit > >>:( ). > >> > >>Any clue on what's wrong on Mac OS? > >> > >>BTW, which would be the current non-Cog VMs to try? > >> > > > >Oops, I mistakenly said that the Cog VMs could be used. But it looks like > >there is a regression or code merge problem of some sort. I'm afraid that > >I was testing with my own locally compiled Cog VM and did not notice the > >problem. > > > >A unix Mac VM from squeakvm.org/unix should demonstrate the correct > >behavior. > > > >CC to vm-dev list: > > > >Eliot, the fix for this was here (but it seems to have been overridden by > >a more recent change): > > > > Name: VMMaker.oscog-dtl.286 > > Author: dtl > > Time: 4 May 2013, 11:29:25.237 am > > UUID: 8be237d9-7812-4792-9723-90f9cff0c2e9 > > Ancestors: VMMaker.oscog-eem.285 > > > > Replace broken primitiveUtcWithOffset with a version that works. > > > >Dave > > Thanks Dave, > > I could run on 64 bits Ubuntu with the VM from squeakvm.org/unix. I'll > try the Mac VM when I get the chance to borrow a Mac again. > > One thing that seems to be missing is a Windows interpreter with the > new primitives, although I don't know if there is a real need for that. > Ian did a build of the Windows VM that should have the necessary support. Try the Squeak4.1.2.2612 VM from http://squeakvm.org/win32/. One thing to note - if the primitive is not present, DateAndTime will fall back on the old logic, and it should produce reasonable results. > Additionally, besides getting rid of <primitive: 137> > (primLocalSecondsClock that will overflow in 2037), it would be great > to stop using <primitive: 135> (primLocalSecondsClock), that overflows > every six days. 
But for this, we would need a new <primitive: 136> > (primSignal:atMilliseconds:) as it uses the same time base. This > would enable a serious simplification of Delay. > I think that Eliot is planning to update Squeak to use the microsecond clock primitive, which removes any 2037 issues. I'm not sure if that would include a change to primSignal:atMilliseconds:. Dave |
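The two rollover figures mentioned in this exchange (2037 for the seconds clock, roughly six days for the clock behind <primitive: 135>) can be reproduced with a few lines of arithmetic. This is a rough Python sketch under two stated assumptions that do not come from the thread itself: the seconds clock is an unsigned 32-bit count from the Smalltalk epoch of 1 January 1901, and the millisecond clock wraps after 2^29 ticks. The function names are hypothetical.

```python
from datetime import datetime, timedelta

# Assumption: the Smalltalk epoch is 1 January 1901, 00:00.
SMALLTALK_EPOCH = datetime(1901, 1, 1)


def seconds_clock_rollover(bits=32):
    """Moment at which an unsigned seconds counter of the given width wraps."""
    return SMALLTALK_EPOCH + timedelta(seconds=2 ** bits)


def millisecond_clock_period_days(bits=29):
    """Days between wraparounds of a millisecond counter of the given width."""
    return (2 ** bits) / 1000 / 86400


print(seconds_clock_rollover())                    # falls in early 2037
print(round(millisecond_clock_period_days(), 2))   # a little over six days
```

Under these assumptions both numbers agree with the figures quoted above, which is why a clock that never rolls over would remove a special case from Delay.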
In reply to this post by ccrraaiigg
Hi Craig, one solution to your question (below).
Also, a counterpart rule to "Please use new threads for new threads" is, "Please don't start new threads for the same thread". :-) e.g., by changing the subject twice there are now three separate "threads" which are really the same thread. (see screenshot) (As an experiment, I've composed this "reply" anew, but C&P'd the subject-line from Craig's last post because I want to see whether Gmail renders this as a new thread or whether it collates it by the subject line). > Short: Yeesh, I'm deleting this thread for sure. :) > > Long: > > This matters to me, because I want to be informed while having > limited time to spend. > > Naturally, I wondered how I might fix this myself, leaving the > delicate sensibilities of my fellow raconteurs untrodden. I'm not sure > what fix would work. A separate Gmail account dedicated to mailing-list reading and responding. It lets the tail wag the dog and still presents threads collated by subject-line. > Sometimes a reply has nothing to do with the > message to which the responder is replying, and the subject line is > totally different. Sometimes a reply is actually a response, and the > subject line might be the same ("hyperbole is great!"), somewhat > different ("perhaps communication is better [was: 'hyperbole is > great!']"), or totally different. Using simple message ID references and > the conventions of normal conversation, without turning it into an AI > project, seems the best we can manage. I totally agree, in principle! If there's a graph of message IDs, the mail clients ought to make use of it over Stringy matching! Just like I wish Eliot and Bert would use the graph of ancestry in Monticello rather than stringy name matching for "branches".. :) |
In reply to this post by Louis LaBrunda
On Mon, May 26, 2014 at 11:12 AM, Louis LaBrunda
<[hidden email]> wrote: > On Mon, 26 May 2014 17:29:19 +0200, Bert Freudenberg <[hidden email]> > wrote: > >> >>On 26.05.2014, at 17:16, David T. Lewis <[hidden email]> wrote: >> >>> On Mon, May 26, 2014 at 10:48:16AM -0400, Louis LaBrunda wrote: >>>> >>>> On Sun, 25 May 2014 13:48:44 -0400, "David T. Lewis" <[hidden email]> >>>> wrote: >>>> >>>>> I have been working on a variation of class DateAndTime that replaces its >>>>> instance variables (seconds offset jdn nanos) with two instance variables, >>>>> utcMicroseconds to represent microseconds elapsed since the Posix epoch, and >>>>> localOffsetSeconds to represent the local time zone offset. When instantiating >>>>> the time now, A single call primitiveUtcWithOffset is used to obtain these >>>>> two values atomically as reported by the underlying platform. >>>>> >>>>> There are several advantages to this representation of DateAndTime, the most >>>>> important of which is that its magnitude is unambiguous regardless of daylight >>>>> savings transitions in local time zones. >>>>> >>>> Hi Dave, >>>> >>>> May I respectfully ask why localOffsetSeconds (to represent the local time >>>> zone offset) is needed? It seems to me a UTC time is enough. Is there >>>> really a need for the timezone offset the instance was created in? Does >>>> every DateAndTime instance need to carry this offset around with it? I >>>> would think the offset is only needed if one wants to display a date/time >>>> as a local value and then one could get the local offset from the VM or a >>>> program setting the user had previously supplied regardless of where the >>>> computer was setup to run. I guess there might be some historic interest >>>> as to the timezone an instances (or many instances) was created in but one >>>> could just keep that as a separate value. >>> >>> Hi Lou, >>> >>> Good question. In fact, one of the reasons I like the UTC implementation is >>> that it helps clarify the two main responsibilities of DateAndTime. 
One is >>> to represent time as a magnitude (for duration calculation, etc). The other >>> is to display time in the frame of reference of a local time zone. It is not >>> at all clear to me that those two responsibilities belong in the same class. >>> >>> Dave >> >>We need to be able to distinguish between local and universal time. It would be rather inconvenient if asking a DateAndTime for e.g. the hour would not be made to answer the local hour. Arguably the local time offset could be moved to a subclass, but having a single DateAndTime class is appealing for simplicity reasons, too. >>- Bert - > > I guess a DateAndTime class (without offset) could always answer 0 for the > offset and a DateAndTimeWithOffset subclass could carry and answer the > offset. Methods could be provided to morph one into the other if desired. No, one of the core requirements of a DateAndTime has always been to be able to answer the local time. It's fine for its internal representation to be in UTC, but that requirement cannot go away. If you want all UTC DateAndTime's then just specify an offset of 0. |
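The design discussed here (UTC stored internally, local time answered on request, offset 0 meaning plain UTC) can be illustrated with a small model. This is a Python sketch only, not the Squeak implementation; the class name `UtcDateAndTime` and its method names are hypothetical, and only the two field names echo the instance variables described in the thread.

```python
class UtcDateAndTime:
    """Toy model: magnitude kept in UTC, offset used only for local answers."""

    def __init__(self, utc_microseconds, local_offset_seconds=0):
        self.utc_microseconds = utc_microseconds
        self.local_offset_seconds = local_offset_seconds

    def utc_hour(self):
        """Hour of day in UTC."""
        return (self.utc_microseconds // 1_000_000 // 3600) % 24

    def local_hour(self):
        """Hour of day in the zone given by local_offset_seconds."""
        local_seconds = self.utc_microseconds // 1_000_000 + self.local_offset_seconds
        return (local_seconds // 3600) % 24


# 2014-05-26 15:00:00 UTC, seen from a zone three hours west of UTC.
stamp = UtcDateAndTime(1_401_116_400 * 1_000_000, -10800)
print(stamp.utc_hour(), stamp.local_hour())
```

With an offset of 0 the two accessors answer the same hour, which is the "all UTC" case described in the last message.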
In reply to this post by Bert Freudenberg
On Mon, May 26, 2014 at 05:29:19PM +0200, Bert Freudenberg wrote:
> On 26.05.2014, at 17:16, David T. Lewis <[hidden email]> wrote: > > On Mon, May 26, 2014 at 10:48:16AM -0400, Louis LaBrunda wrote: > >> > >> May I respectfully ask why localOffsetSeconds (to represent the local time > >> zone offset) is needed? It seems to me a UTC time is enough. Is there > >> really a need for the timezone offset the instance was created in? Does > >> every DateAndTime instance need to carry this offset around with it? I > >> would think the offset is only needed if one wants to display a date/time > >> as a local value and then one could get the local offset from the VM or a > >> program setting the user had previously supplied regardless of where the > >> computer was setup to run. I guess there might be some historic interest > >> as to the timezone an instances (or many instances) was created in but one > >> could just keep that as a separate value. > > > > Hi Lou, > > > > Good question. In fact, one of the reasons I like the UTC implementation is > > that it helps clarify the two main responsibilities of DateAndTime. One is > > to represent time as a magnitude (for duration calculation, etc). The other > > is to display time in the frame of reference of a local time zone. It is not > > at all clear to me that those two responsibilities belong in the same class. > > > > Dave > > We need to be able to distinguish between local and universal time. It would be rather inconvenient if asking a DateAndTime for e.g. the hour would not be made to answer the local hour. Arguably the local time offset could be moved to a subclass, but having a single DateAndTime class is appealing for simplicity reasons, too. > One thing that seemed awkward to me was the question of how to implement #=. I chose to let it be a comparison of just the utcMicroseconds magnitude, ignoring the localOffsetSeconds. 
That may be the wrong thing to do, although if I think about a DateAndTime as a magnitude, I expect that it should be true that one instance is "greater than or: [equal to]" another when utcMicroseconds are equal, regardless of the local offset. Dave |
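The choice described above (equality and ordering based only on the UTC magnitude, with the local offset ignored) can be sketched as follows. This is an illustrative Python model, not the Squeak code; the class name `Instant` is hypothetical. The hash is also derived from the magnitude alone so that equal instances hash alike.

```python
from functools import total_ordering


@total_ordering
class Instant:
    """Toy model comparing only the UTC magnitude, never the offset."""

    def __init__(self, utc_microseconds, local_offset_seconds=0):
        self.utc_microseconds = utc_microseconds
        self.local_offset_seconds = local_offset_seconds

    def __eq__(self, other):
        if not isinstance(other, Instant):
            return NotImplemented
        return self.utc_microseconds == other.utc_microseconds

    def __lt__(self, other):
        if not isinstance(other, Instant):
            return NotImplemented
        return self.utc_microseconds < other.utc_microseconds

    def __hash__(self):
        # Must agree with __eq__, so the offset is left out here too.
        return hash(self.utc_microseconds)


in_buenos_aires = Instant(1_401_116_400_000_000, -10800)
in_berlin = Instant(1_401_116_400_000_000, 7200)
print(in_buenos_aires == in_berlin, in_buenos_aires >= in_berlin)
```

The trade-off is that two instances printing different local times compare as equal, which is the awkwardness raised in the message above.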
In reply to this post by David T. Lewis
Hi Dave, as someone who works with large systems in Squeak, I'm always
interested in _storage efficiency_ as much as execution efficiency. DateAndTime, in particular, is a very common domain element with a high potential for there to be many millions of instances in a given domain model. Apps which have millions of objects with merely a Date attribute can canonicalize them. And, apps which have millions of Time objects can canonicalize them. But LargeInteger's are not easy to canonicalize (e.g., utcMicroseconds). So a database system with millions of DateAndTime's would have to do _two_ reads for every DateAndTime instance instead of just one today (because SmallIntegers are immediate, while LargeIntegers require their own storage buffer). One thing I really like about the current implementation of DateAndTime is how it carefully avoids LargeIntegers by having large-grained "platforms" to arrive at the current time. e.g., each 'jdn' is a chunk of (1000000*60*60*24) microseconds. Your new implementation reflects an increase of 86 BILLION utcMicroseconds for every 1 jdn. Small, all-in-memory benchmarks may show faster with the LI, but I'm concerned that large-scale apps might be significantly impacted in the opposite way.. Would it be possible to re-optimize this part of the representation while still maintaining internal UTC representation to solve your concern about daylight-savings? Thanks. On Sun, May 25, 2014 at 12:48 PM, David T. Lewis <[hidden email]> wrote: > I have been working on a variation of class DateAndTime that replaces its > instance variables (seconds offset jdn nanos) with two instance variables, > utcMicroseconds to represent microseconds elapsed since the Posix epoch, and > localOffsetSeconds to represent the local time zone offset. When instantiating > the time now, A single call primitiveUtcWithOffset is used to obtain these > two values atomically as reported by the underlying platform. 
> > There are several advantages to this representation of DateAndTime, the most > important of which is that its magnitude is unambiguous regardless of daylight > savings transitions in local time zones. > > This is my attempt to address some historical baggage in Squeak. The VM > reports time related to the local time zone, and the image attempts to > convert to UTC (sometimes incorrectly). A UTC based representation makes the > implementation of time zone tables more straightforward (see for example > the Olson time zone tables in TimeZoneDatabase on SqueakMap). > > I am attaching the source code as a SAR file that can be loaded into a fully > updated Squeak trunk image. The conversion process is slow, so be patient > if you load it. > > This can be run on either an interpreter VM or Cog, but if you use Cog, please > use a version dated June 2013 or later (the VM in the Squeak 4.5 all-in-one > is fine). > > I am also attaching a copy of LXTestDateAndTimePerformance, which can be > used to compare the performance of some basic DateAndTime functions. > > Performance of the UTC based DateAndTime is generally favorable compared to > the original. Here is what I see on my system (smaller numbers are better). > > LXTestDateAndTimePerformance test results using the original Squeak DateAndTime > on an interpreter VM: > { > #testNow->10143 . > #testEquals->30986 . > #testGreaterThan->80199 . > #testLessThan->75912 . > #testPrintString->10429 . > #testStringAsDateAndTime->44657 > } > > LXTestDateAndTimePerformance test results using the new UTC based DateAndTime > on an interpreter VM: > { > #testNow->6423 . > #testEquals->31625 . > #testGreaterThan->22999 . > #testLessThan->18514 . > #testPrintString->12502 . > #testStringAsDateAndTime->32912 > } > > (CC to Brent Pinkney, author of the excellent Squeak Chronology package) > > Dave > > > > |
In reply to this post by David T. Lewis
Quoting "David T. Lewis" <[hidden email]>: > ... >> > > Ian did a build of the Windows VM that should have the necessary support. Try > the Squeak4.1.2.2612 VM from http://squeakvm.org/win32/. That VM fails for <primitive: 'primitiveUtcWithOffset'> <primitive: 240> primUtcMicrosecondClock <primitive: 241> primLocalMicrosecondClock > One thing to note - if the primitive is not present, DateAndTime will fall > back on the old logic, and it should produce reasonable results. Yes, Cuis already does that... I'd prefer to be able to assume that any future VM will provide <primitive: 'primitiveUtcWithOffset'> and clean up the code. Would that be asking for too much? >> Additionally, besides getting rid of <primitive: 137> >> (primLocalSecondsClock that will overflow in 2037), it would be great >> to stop using <primitive: 135> (primMillisecondClock), that overflows >> every six days. But for this, we would need a new <primitive: 136> >> (primSignal:atMilliseconds:) as it uses the same time base. This >> would enable a serious simplification of Delay. >> > > I think that Eliot is planning to update Squeak to use the microsecond clock > primitive, which removes any 2037 issues. I'm not sure if that would include > a change to primSignal:atMilliseconds: > > Dave I see. But given that the Delay code is fragile and not trivial at all, the advantages of relying on a clock that never rolls over would be significant. Please, Eliot, consider this when you work on this code. Cheers, Juan Vuletich |
In reply to this post by Chris Muller-3
2014-05-26 20:09 GMT+02:00 Chris Muller <[hidden email]>: Hi Dave, as someone who works with large systems in Squeak, I'm always That's more or less the Pharo path.
|
In reply to this post by J. Vuletich (mail lists)
On Mon, May 26, 2014 at 06:16:03PM +0000, J. Vuletich (mail lists) wrote:
> > Quoting "David T. Lewis" <[hidden email]>: > > >... > >> > > > >Ian did a build of the Windows VM that should have the necessary support. > >Try > >the Squeak4.1.2.2612 VM from http://squeakvm.org/win32/. > > That VM fails for > <primitive: 'primitiveUtcWithOffset'> > <primitive: 240> primUtcMicrosecondClock > <primitive: 241> primLocalMicrosecondClock > > >One thing to note - if the primitive is not present, DateAndTime will fall > >back on the old logic, and it should produce reasonable results. > > Yes, Cuis already does that... I'd prefer to be able to assume that > any future VM will provide <primitive: 'primitiveUtcWithOffset'> and > clean the code. Would that asking for too much? You should expect the following two primitives to be present in all VMs (comments are from the VMM trunk implementations): primitiveUtcWithOffset "Answer an array with UTC microseconds since the Posix epoch and the current seconds offset from GMT in the local time zone. An empty two element array may be supplied as a parameter. This is a named (not numbered) primitive in the null module (ie the VM)" primitiveUTCMicrosecondClock "Answer the UTC microseconds since the Smalltalk epoch. The value is derived from the Posix epoch (see primitiveUTCMicrosecondClock) with a constant offset corresponding to elapsed microseconds between the two epochs according to RFC 868." Dave > > >>Additionally, besides getting rid of <primitive: 137> > >>(primLocalSecondsClock that will overflow in 2037), it would be great > >>to stop using <primitive: 135> (primLocalSecondsClock), that overflows > >>every six days. But for this, we would need a new <primitive: 136> > >>(primSignal:atMilliseconds:) as it uses on the same time base. This > >>would enable a serious simplification of Delay. > >> > > > >I think that Eliot is planning to update Squeak to use the microsecond > >clock > >primitive, which removes any 2037 issues. 
I'm not sure if that would > >include > >a change to primSignal:atMilliseconds: > > > >Dave > > I see. But given that the Delay code is fragile and not trivial at > all, the advantages of relying on a clock that never rolls over > would be significant. > > Please, Eliot, consider this when you work on this code. > > > Cheers, > Juan Vuletich > |
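The primitive comments quoted in this message mention a constant offset between the Posix epoch and the Smalltalk epoch, derived from RFC 868. That constant can be worked out as below. This is a Python sketch for illustration only; it assumes the Smalltalk epoch is 1 January 1901, and the helper name `posix_to_smalltalk_microseconds` is hypothetical. RFC 868 itself gives 2208988800 seconds between 1 January 1900 and 1 January 1970.

```python
from datetime import datetime

# Assumption: Smalltalk epoch is 1 January 1901; the Posix epoch is 1 January 1970.
SMALLTALK_EPOCH = datetime(1901, 1, 1)
POSIX_EPOCH = datetime(1970, 1, 1)

# Elapsed seconds between the two epochs.
EPOCH_DELTA_SECONDS = int((POSIX_EPOCH - SMALLTALK_EPOCH).total_seconds())

# RFC 868 counts from 1900, one (non-leap) year earlier than the Smalltalk epoch.
RFC_868_SECONDS_1900_TO_1970 = 2208988800


def posix_to_smalltalk_microseconds(posix_microseconds):
    """Shift a Posix-epoch microsecond count to the Smalltalk epoch."""
    return posix_microseconds + EPOCH_DELTA_SECONDS * 1_000_000


print(EPOCH_DELTA_SECONDS)
print(RFC_868_SECONDS_1900_TO_1970 - 365 * 86400 == EPOCH_DELTA_SECONDS)
```

Because the shift is a constant, either epoch can serve as the origin without changing the magnitude semantics of the clock.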
In reply to this post by Chris Muller-3
On 2014-05-26, at 19:31, Chris Muller <[hidden email]> wrote:
> >> Sometimes a reply has nothing to do with the >> message to which the responder is replying, and the subject line is >> totally different. Sometimes a reply is actually a response, and the >> subject line might be the same ("hyperbole is great!"), somewhat >> different ("perhaps communication is better [was: 'hyperbole is >> great!']"), or totally different. Using simple message ID references and >> the conventions of normal conversation, without turning it into an AI >> project, seems the best we can manage. > > I totally agree, in principle! If there's a graph of message id's, > the mail clients ought to make use of it over Stringy matching! Just > like I wish Eliot and Bert would use the graph of ancestry in > Monticello rather than stringy name matching for "branches".. :) In case this is not just a meant-to-be-funny remark, you should start a new thread for that topic. For my part I can't see much of an analogy here. - Bert - |
In reply to this post by Chris Muller-3
On Mon, May 26, 2014 at 01:09:06PM -0500, Chris Muller wrote:
> Hi Dave, as someone who works with large systems in Squeak, I'm always > interested in _storage efficiency_ as much as execution efficiency. > > DateAndTime, in particular, is a very common domain element with a > high potential for there to be many millions of instances in a given > domain model. > > Apps which have millions of objects with merely a Date attribute can > canonicalize them. > And, apps which have millions of Time objects can canonicalize them. > > But LargeInteger's are not easy to canonicalize (e.g., > utcMicroseconds). So a database system with millions of DateAndTime's > would have to do _two_ reads for every DateAndTime instance instead of > just one today (because SmallIntegers are immediate, while > LargeIntegers require their own storage buffer). > > One thing I really like about the current implementation of > DateAndTime is how it carefully avoids LargeIntegers by having > large-grained "platforms" to arrive at the current time. e.g., each > 'jdn' is a chunk of (1000000*60*60*24) microseconds. Your new > implementation reflects an increase of 86 BILLION utcMicroseconds for > every 1 jdn. Understood. But to clarify: The name "utcMicroseconds" reflects only the precision of the time scale, it is not meant to imply what kind of number is used to represent it. In fact, a DateAndTime with nanosecond precision will typically appear as a Fraction rather than a LargeInteger. But microsecond precision is what is currently reported by the primitives, so these are LargeIntegers relative to the Posix epoch. For saving to a database, you could certainly shift the time origin and/or limit the precision of the time representation. That's more or less what the current jdn/seconds/nanos representation does. > > Small, all-in-memory benchmarks may show faster with the LI, but I'm > concerned that large-scale apps might be significantly impacted in the > opposite way.. 
> > Would it be possible to re-optimize this part of the representation > while still maintaining internal UTC representation to solve your > concern about daylight-savings? Sure, but just to clarify: This is not something that I am proposing for Squeak trunk. It is a follow-up project to my TimeZoneDatabase that I have been meaning to do for the last 15 years. I finally got around to trying it, so I figured I'd go ahead and publish the code :) Dave > > Thanks. > > On Sun, May 25, 2014 at 12:48 PM, David T. Lewis <[hidden email]> wrote: > > I have been working on a variation of class DateAndTime that replaces its > > instance variables (seconds offset jdn nanos) with two instance variables, > > utcMicroseconds to represent microseconds elapsed since the Posix epoch, and > > localOffsetSeconds to represent the local time zone offset. When instantiating > > the time now, A single call primitiveUtcWithOffset is used to obtain these > > two values atomically as reported by the underlying platform. > > > > There are several advantages to this representation of DateAndTime, the most > > important of which is that its magnitude is unambiguous regardless of daylight > > savings transitions in local time zones. > > > > This is my attempt to address some historical baggage in Squeak. The VM > > reports time related to the local time zone, and the image attempts to > > convert to UTC (sometimes incorrectly). A UTC based representation makes the > > implementation of time zone tables more straightforward (see for example > > the Olson time zone tables in TimeZoneDatabase on SqueakMap). > > > > I am attaching the source code as a SAR file that can be loaded into a fully > > updated Squeak trunk image. The conversion process is slow, so be patient > > if you load it. > > > > This can be run on either an interpreter VM or Cog, but if you use Cog, please > > use a version dated June 2013 or later (the VM in the Squeak 4.5 all-in-one > > is fine). 
> >
> > I am also attaching a copy of LXTestDateAndTimePerformance, which can be
> > used to compare the performance of some basic DateAndTime functions.
> >
> > Performance of the UTC based DateAndTime is generally favorable compared to
> > the original. Here is what I see on my system (smaller numbers are better).
> >
> > LXTestDateAndTimePerformance test results using the original Squeak DateAndTime
> > on an interpreter VM:
> > {
> > #testNow->10143 .
> > #testEquals->30986 .
> > #testGreaterThan->80199 .
> > #testLessThan->75912 .
> > #testPrintString->10429 .
> > #testStringAsDateAndTime->44657
> > }
> >
> > LXTestDateAndTimePerformance test results using the new UTC based DateAndTime
> > on an interpreter VM:
> > {
> > #testNow->6423 .
> > #testEquals->31625 .
> > #testGreaterThan->22999 .
> > #testLessThan->18514 .
> > #testPrintString->12502 .
> > #testStringAsDateAndTime->32912
> > }
> >
> > (CC to Brent Pinkney, author of the excellent Squeak Chronology package)
> >
> > Dave
> >
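For readers who want the two result sets side by side, here is a small sketch that divides each UTC-based timing by the corresponding original timing. It is written in Python purely as arithmetic on the numbers quoted above; it is not part of LXTestDateAndTimePerformance, and the names `original`, `utc_based` and `ratios` are made up for this illustration. The message does not state the units of the timings, so only the ratios are meaningful.

```python
# Timings copied from the quoted message (smaller is better; units not stated).
original = {
    "testNow": 10143,
    "testEquals": 30986,
    "testGreaterThan": 80199,
    "testLessThan": 75912,
    "testPrintString": 10429,
    "testStringAsDateAndTime": 44657,
}
utc_based = {
    "testNow": 6423,
    "testEquals": 31625,
    "testGreaterThan": 22999,
    "testLessThan": 18514,
    "testPrintString": 12502,
    "testStringAsDateAndTime": 32912,
}


def ratios(before, after):
    """Return after/before per test; a value below 1.0 means the new code is faster."""
    return {name: after[name] / before[name] for name in before}


result = ratios(original, utc_based)
for name, ratio in result.items():
    print(f"{name:26s} {ratio:.2f}")
```

On these figures, four of the six tests come out below 1.0 (faster with the UTC-based class), while testEquals and testPrintString come out slightly above 1.0.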
In reply to this post by Bert Freudenberg
>> I totally agree, in principle! If there's a graph of message id's,
>> the mail clients ought to make use of it over Stringy matching! Just
>> like I wish Eliot and Bert would use the graph of ancestry in
>> Monticello rather than stringy name matching for "branches".. :)
>
> In case this is not just a meant-to-be-funny remark, you should start a new thread for that topic.
>
> For my part I can't see much of an analogy here.

The "In-Reply-To:" hierarchy is to the MCVersionInfo hierarchy as the Subject-Line of an email is to an expanded MCVersionName of saved package versions. Each is a loosey-goosey String-matching strategy replacing a hard, UUID reference hierarchy.
In reply to this post by Chris Muller-3
On Mon, May 26, 2014 at 01:09:06PM -0500, Chris Muller wrote:
> Hi Dave, as someone who works with large systems in Squeak, I'm always
> interested in _storage efficiency_ as much as execution efficiency.
>
> DateAndTime, in particular, is a very common domain element with a
> high potential for there to be many millions of instances in a given
> domain model.
>
> Apps which have millions of objects with merely a Date attribute can
> canonicalize them.
> And, apps which have millions of Time objects can canonicalize them.
>
> But LargeInteger's are not easy to canonicalize (e.g.,
> utcMicroseconds). So a database system with millions of DateAndTime's
> would have to do _two_ reads for every DateAndTime instance instead of
> just one today (because SmallIntegers are immediate, while
> LargeIntegers require their own storage buffer).

Hi Chris,

I do not have a lot of experience with database systems, so I would like to better understand the issue for storage of large numeric values.

I was under the impression that modern SQL databases provide direct support for large integer data types (e.g. bigint for SQL server), and my assumption was that object databases such as Magma or GemStone would make this a non-issue. Why is it that a large (64 bit) integer should be any more or less difficult to persist than a small integer?

This may be a dumb question but I am curious.

Thanks,
Dave
The issue actually relates purely to Squeak domain models. Consider the case of an all-in-memory object model in Squeak, with no database involved at all. It is very feasible that an app would want to import a flat-file dataset that involves the creation of a few million DateAndTime instances (along with other objects, of course), to the point where memory constraints begin to be noticed.

When dealing with this level of proliferation potential for a particular class, and for such a base data type that we don't want to endure changing again, I want us to strongly scrutinize the internal representation.

In this case, the use of 'utcMicroseconds' introduces a lot of duplicate bit-patterns in memory that are very hard, if not impossible, to share.

The simplest case is two equivalent instances of DateAndTime (read from separate files). Despite being equivalent, their 'utcMicroseconds' values will be separate objects, each consuming separate memory space. There is no easy way to share the same 'utcMicroseconds' instance between them.

But fully-equivalent DateAndTime's are not even half of the concern -- the high-order bits of every DateAndTime's 'utcMicroseconds' duplicate the same bit pattern, again and again, eating up memory.

That doesn't happen when the internal representations are, or can be, canonicalized, as in the case of using SmallIntegers. Yes, Brent's original representation requires two additional slots per instance, but the _contents_ of those slots are SmallIntegers -- shared memory.

On Mon, May 26, 2014 at 8:29 PM, David T. Lewis <[hidden email]> wrote:
> On Mon, May 26, 2014 at 01:09:06PM -0500, Chris Muller wrote:
>> Hi Dave, as someone who works with large systems in Squeak, I'm always
>> interested in _storage efficiency_ as much as execution efficiency.
>>
>> DateAndTime, in particular, is a very common domain element with a
>> high potential for there to be many millions of instances in a given
>> domain model.
>>
>> Apps which have millions of objects with merely a Date attribute can
>> canonicalize them.
>> And, apps which have millions of Time objects can canonicalize them.
>>
>> But LargeInteger's are not easy to canonicalize (e.g.,
>> utcMicroseconds). So a database system with millions of DateAndTime's
>> would have to do _two_ reads for every DateAndTime instance instead of
>> just one today (because SmallIntegers are immediate, while
>> LargeIntegers require their own storage buffer).
>
> Hi Chris,
>
> I do not have a lot of experience with database systems, so I would
> like to better understand the issue for storage of large numeric values.
>
> I was under the impression that modern SQL databases provide direct
> support for large integer data types (e.g. bigint for SQL server), and my
> assumption was that object databases such as Magma or GemStone would
> make this a non-issue. Why is it that a large (64 bit) integer should
> be any more or less difficult to persist than a small integer?
>
> This may be a dumb question but I am curious.
>
> Thanks,
> Dave
>
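To make the SmallInteger versus LargeInteger distinction in this exchange concrete, the following sketch checks which values fit in an immediate SmallInteger. It is Python arithmetic, not Squeak code, and it rests on two assumptions that the thread does not spell out: a 32-bit image where SmallInteger is a 31-bit signed value (maxVal = 2^30 - 1), and the traditional fields holding a Julian day number, seconds since midnight, and nanoseconds within the second. The helper `is_small_integer` and the sample date are hypothetical.

```python
from datetime import datetime, timezone

# Assumption: 32-bit Squeak image, SmallInteger is a 31-bit signed immediate.
SMALL_INT_MAX = 2**30 - 1      # 1073741823
SMALL_INT_MIN = -(2**30)


def is_small_integer(n):
    """True if n would fit in an immediate SmallInteger under the assumption above."""
    return SMALL_INT_MIN <= n <= SMALL_INT_MAX


# One 'jdn' step, as quoted in the thread: (1000000*60*60*24) microseconds.
MICROS_PER_DAY = 1_000_000 * 60 * 60 * 24

# Microseconds since the Posix epoch for an arbitrary sample instant in 2014.
sample = datetime(2014, 5, 26, tzinfo=timezone.utc)
epoch_micros = int(sample.timestamp()) * 1_000_000

# Largest values the traditional fields take (assumed field meanings, see above).
jdn = 2440588 + int(sample.timestamp()) // 86400   # 2440588 is the JDN of 1970-01-01
max_seconds_in_day = 86_399
max_nanos = 999_999_999

print("micros per day fits:  ", is_small_integer(MICROS_PER_DAY))
print("epoch micros fits:    ", is_small_integer(epoch_micros))
print("jdn/seconds/nanos fit:",
      all(is_small_integer(v) for v in (jdn, max_seconds_in_day, max_nanos)))
print("bits needed for epoch micros:", epoch_micros.bit_length())
```

Under these assumptions even a single day's worth of microseconds is outside the SmallInteger range, which is why utcMicroseconds ends up as a separately allocated large integer while the three traditional fields stay immediate.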
2014-05-27 4:30 GMT+02:00 Chris Muller <[hidden email]>:
> The issue actually relates purely to Squeak domain models. Consider

Well, in the current 32 bit image format, SmallIntegers are not exactly shared; they are immediate values. Each consumes exactly 32 bits.

For a compact class like LargePosOrNegInteger, I don't remember exactly what the header size is, but you get 64 bits for data, so I would be surprised to see a major difference wrt consumed memory.
Nicolas
On Tue, May 27, 2014 at 09:55:33PM +0200, Nicolas Cellier wrote:
> 2014-05-27 4:30 GMT+02:00 Chris Muller <[hidden email]>:
>
> > The issue actually relates purely to Squeak domain models. Consider
> > the case of an all-in-memory object model in Squeak, with no database
> > involved at all. It is very feasible an app would want to import a
> > flat-file dataset that involves creation a few million DateAndTime
> > instances (along with other objects, of course) to the point where
> > memory constraints begin to be noticed.
> >
> > When dealing with this level of proliferation-potential of a
> > particular class, and for such a base data-type we don't want to
> > endure changing again, I want us to strongly scrutinize the internal
> > representation.
> >
> > In this case, the use of 'utcMicroseconds' introduces a lot of
> > duplicate bit-patterns in memory that are very hard, if not
> > impossible, to share.
> >
> > The simplest case are two equivalent instances of DateAndTime (read
> > from separate files). Despite being equivalent, their
> > utcMicroseconds' will be separate objects each consuming separate
> > memory space. There is no easy way to share the same
> > 'utcMicroseconds' instance between them.
> >
> > But fully-equivalent DateAndTime's is not even half of the concern --
> > the high-order bits of every DateAndTime's 'utcMicroseconds'
> > duplicates the same bit pattern, again and again, eating up memory.
> >
> > That doesn't happen when the internal representations are, or can be,
> > canonicalized, as in the case of using SmallIntegers. Yes, Brent's
> > original representation requires two additional slots per instance,
> > but the _contents_ of those slots are SmallIntegers -- shared memory.
> >
> >
> Well, in current 32 bit image format, SmallInteger are not exactly shared,
> they are immediate values.
> Each consumes exactly 32 bits.
>
> For a compact class like LargePosOrNegInteger, I don't remember what is the
> header size exactly, but you get 64 bits for data, I would be surprised to
> see a major difference wrt consumed memory.
>

Smalltalk compactClassesArray includes: DateAndTime ==> false
Smalltalk compactClassesArray includes: LargePositiveInteger ==> true

So for the traditional DateAndTime implementation, an instance requires:

2 words of header (64 bits)
3 words for the small integer jdn/seconds/nanos variables
1 word for the pointer to the offset object, which is an instance of Duration

In practice, most instances of DateAndTime within an image will share the same offset object, so for purposes of estimation assume that this takes no extra space. Thus each instance requires 6 words of space in the object memory (maybe a bit more on average if the DateAndTime instances are not sharing the same Duration instance for one reason or another).

For the UTC based implementation of DateAndTime, each instance requires:

2 words of header
1 word for the small integer localOffsetSeconds variable
1 word for the pointer to the LargePositiveInteger representing utcMicroseconds
1 word of header for the large positive integer
2 words of data for the value of the large positive integer

Thus each instance requires 7 words of space in the object memory.

So there is a difference, but it would probably not be a large effect on overall space utilization, even assuming complete sharing of the offset Duration instances.

Dave
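The word counts in this message can be tallied mechanically. The sketch below is Python arithmetic, not Squeak code; it copies the per-instance breakdown given above and assumes 4-byte words, consistent with "2 words of header (64 bits)". The dictionary labels, the helper names and the one-million-instance figure are illustrative choices, not anything measured in an image.

```python
WORD_BYTES = 4  # assumption: 32-bit image, so "2 words of header" is 64 bits

# Per-instance word counts as listed in the message above.
traditional = {
    "object header": 2,
    "jdn, seconds, nanos (SmallIntegers)": 3,
    "pointer to (shared) offset Duration": 1,
}
utc_based = {
    "object header": 2,
    "localOffsetSeconds (SmallInteger)": 1,
    "pointer to utcMicroseconds": 1,
    "LargePositiveInteger header": 1,
    "LargePositiveInteger data": 2,
}


def words(layout):
    """Total words of object memory per DateAndTime instance."""
    return sum(layout.values())


def total_bytes(layout, instances):
    """Bytes used by the given number of instances, ignoring shared objects."""
    return words(layout) * WORD_BYTES * instances


instances = 1_000_000  # illustrative population size
old_bytes = total_bytes(traditional, instances)
new_bytes = total_bytes(utc_based, instances)

print("words per instance:", words(traditional), "vs", words(utc_based))
print("bytes per million: ", old_bytes, "vs", new_bytes)
print("extra bytes:       ", new_bytes - old_bytes)
```

With these counts the UTC-based layout costs one extra word per instance, which is 4 bytes under the 32-bit assumption.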