Forgive the intrusion, but here's a COBOLer asking for advice. Perhaps you can direct me to other Smalltalk sites which might have papers covering the topic in depth. I'm not familiar with the language but do have a copy of Goldberg/Robson "Smalltalk 80".

At the current time there are three COBOL OO compilers, each with their own sub-module for handling GUIs, plus of course one can access other GUI tools, plus access to C++ or Java, and the latest incarnations can be used with Dot Net. Each also has its own collection hierarchy. Hitachi (only marketed in Japan) and Micro Focus both used the Smalltalk hierarchy as their model. Fujitsu produced their own model. (The latter name shouldn't be confused with Fujitsu-Siemens, based in Germany, who are in the process of designing a mainframe compiler with OO features. Additionally there is IBM Enterprise - but other than basic class design, the compiler has the syntax :-

    Class-id. MyIBMClass inherits from JavaBase

Utilising Java, I don't see that IBM currently has an interest in OO COBOL collection features - who knows - that may change.)

Our Standards Committees (ANSI-J4 and ISO WG4) are currently trying to establish a standard, with the intent of portability code-wise between the compilers. They have pared down the number of collection type classes, which as a current user, I find acceptable - with one exception. My own compiler has a SortedCollection which has been totally ignored !

The authors of the draft paper acknowledge that they researched other languages for ideas with specific mention of C, C++, C# and Java. I am more than annoyed that I found no reference to Smalltalk. I intend to include something in my comments like the following :-

"Forget about C and Smalltalk for the moment. The Xerox PARC team had it figured whilst James Gosling (Java) was still going through school and university here in Calgary !".

The Committee have gone firm on the name Class Iterator - which is parallel to your Enumerator. You of course have methods "do", "select", "reject", "collect", "detect", "inject". From my own compiler I have the following methods to iterate through collections - "do", "select", "reject", "collect"; in addition "quitIteration" which I wouldn't mind betting has also been taken from Smalltalk :-)

Now we come to the sticky point on Enumerating/Iterating. Firstly the draft text has just one invoke against the collection, which is the equivalent of "do" that you and I are using. My designs are based on desktop one-user applications. I confess it completely blew me out of the water to think of a collection being "multi-user" accessible just like a flat file or DB system.

If I read correctly, there is the statement that a collection can be "shared" which of course is parallel to multi-access of a DB. It is being asked "What if somebody changes or deletes/removes an element during the course of an iteration ? Alternatively what if User Y attempts a separate add/change/delete while User X is doing an Iterator (Enumerator) against the same collection ...etc.... ?".

With an analyst background I tend to think of problems in design rather than programming terms. As an example, and frankly I doubt there is anything more complex, think of a stock exchange. The NYSE "owns" and maintains the database. They allow individual traders to "share" information from the DB via displayable collection(s). Conjecture on my part - but I would guess that while individual users can add/change/delete what they are viewing, their activity is passed on as a "request" to the NYSE application which after validation either accepts and updates the DB or raises an exception. It seems logical to me that it must function in this way. The "master-owner" of the DB must know and control all activities so that from the updated DB they can "refresh" collections displayed to end-users.

Exactly what would you do in Smalltalk to allow collection multi-sharing ?

If you have gotten this far - thanks for your patience.

Jimmy, Calgary, AB
"James J. Gavan" wrote:
> Forgive the intrusion, but here's a COBOLer asking for advice. Perhaps you can direct me to other Smalltalk sites which might have papers covering the topic in depth. I'm not familiar with the language but do have a copy of Goldberg/Robson "Smalltalk 80".
>
> At the current time there are three COBOL OO compilers, each with their own sub-module for handling GUIs, plus of course one can access other GUI tools, plus access to C++ or Java, and the latest incarnations can be used with Dot Net. Each also has its own collection hierarchy. Hitachi (only marketed in Japan) and Micro Focus both used the Smalltalk hierarchy as their model. Fujitsu produced their own model. (The latter name shouldn't be confused with Fujitsu-Siemens, based in Germany, who are in the process of designing a mainframe compiler with OO features. Additionally there is IBM Enterprise - but other than basic class design, the compiler has the syntax :-
>
>     Class-id. MyIBMClass inherits from JavaBase
>
> Utilising Java, I don't see that IBM currently has an interest in OO COBOL collection features - who knows - that may change.)
>
> Our Standards Committees (ANSI-J4 and ISO WG4) are currently trying to establish a standard, with the intent of portability code-wise between the compilers. They have pared down the number of collection type classes, which as a current user, I find acceptable - with one exception. My own compiler has a SortedCollection which has been totally ignored !
>
> The authors of the draft paper acknowledge that they researched other languages for ideas with specific mention of C, C++, C# and Java. I am more than annoyed that I found no reference to Smalltalk. I intend to include something in my comments like the following :-
>
> "Forget about C and Smalltalk for the moment. The Xerox PARC team had it figured whilst James Gosling (Java) was still going through school and university here in Calgary !".
>
> The Committee have gone firm on the name Class Iterator - which is parallel to your Enumerator. You of course have methods "do", "select", "reject", "collect", "detect", "inject". From my own compiler I have the following methods to iterate through collections - "do", "select", "reject", "collect"; in addition "quitIteration" which I wouldn't mind betting has also been taken from Smalltalk :-)
>
> Now we come to the sticky point on Enumerating/Iterating. Firstly the draft text has just one invoke against the collection, which is the equivalent of "do" that you and I are using.

Yes - everyone does this. But it should ideally be 3 distinct variants -

    forwardDo:
    reverseDo:
    randomlyDo:

The second is sometimes defined in terms of the first, and sometimes implemented directly. Depends on the class.

The third should always, but only, be used when implementing an iteration which does not depend on the order in which items are traversed. For example, the result of a #detect:, #select:, #reject: and #collect: should not depend on the order of traversal. One might say that if a collection has an ordering, the ordering should be preserved in the result, but that makes the result dependent on the class of the receiver, rather than on the operation. Hence the weakest liberal precondition for a correct result will always be that the receiver is a collection, rather than a particular collection class.

On the other hand, #inject: does depend on the traversal, because it accumulates state as the traversal proceeds, and we often want to be able to accumulate non-commutative values. And if one really wants a #detect: that is order dependent, adding #detectFirst: would be a good way to do it.

> My designs are based on desktop one-user applications. I confess it completely blew me out of the water to think of a collection being "multi-user" accessible just like a flat file or DB system.
>
> If I read correctly, there is the statement that a collection can be "shared" which of course is parallel to multi-access of a DB. It is being asked "What if somebody changes or deletes/removes an element during the course of an iteration.

Right - but this part isn't a multi-user concern.

We don't want to specify the sequence of traversal because that would reduce the generality of iteration. So one cannot, in general, know what the sequence is. Therefore, one shouldn't rely on it. Therefore, one shouldn't think that adding to or removing from the collection during a traversal is going to be safe, because it may or may not affect the traversal; i.e. is an added item *always* visited by the traversal already under way? Does the insertion or removal of an item change the index of the remaining (not yet traversed) items? Etc.

At the language level, one can choose to specify the order and effect for each and every case, or one can instead specify that the sequence is effectively non-deterministic. Smalltalk does the latter, and the programmer is expected to understand that she must iterate over a copy of the collection if she intends to change the contents during traversal.

> ...
> Alternatively what if User Y attempts a separate add/change/delete while User X is doing an Iterator (Enumerator) against the same collection ...etc.... ?".

Java spends some time speaking to these issues, at the language level, whereas Smalltalk does not.

The particulars of coordinating access to shared structures are so completely dependent on context that it seems a bit absurd to push this all the way down into collections. Should you synchronize access to the collection as a whole, or should you synchronize individual elements? Do you need one lock, or two? What is the effect of load/store barriers, and when/where are they expected to occur? The performance of such things is intimately connected to knowledge of the context in which they are addressed.

Therefore, the language should make it possible to construct specific solutions, but stay silent as to the "general" solution, i.e. there isn't one.

> With an analyst background I tend to think of problems in design rather than programming terms. As an example, and frankly I doubt there is anything more complex, think of a stock exchange. The NYSE "owns" and maintains the database. They allow individual traders to "share" information from the DB via displayable collection(s). Conjecture on my part - but I would guess that while individual users can add/change/delete what they are viewing, their activity is passed on as a "request" to the NYSE application which after validation either accepts and updates the DB or raises an exception. It seems logical to me that it must function in this way. The "master-owner" of the DB must know and control all activities so that from the updated DB they can "refresh" collections displayed to end-users.
>
> Exactly what would you do in Smalltalk to allow collection multi-sharing ?

Depends on the particular situation, hence it is part of the application, and should not really be a part of the language.

The language and/or library could help, by including various access-control "primitives", and perhaps by including a few, very specifically tuned structures, for use in solving specific problems.

The real trick to multi-access sharing is in reducing the places where it occurs, so you can optimize the daylights out of those few spots. Providing a general purpose "solution" is likely to work against this, resulting in a proliferation of multi-access sharing points, and a system that is difficult if not impossible to optimize when performance problems do arise. "Great. But it doesn't scale. Now what do we do?" You will always need to consider the architecture of the system as a whole.

And if performance *isn't* critical, what were you doing optimizing those collections in the first place? Just put a lock around the whole thing - see - good enough.

So you just need one general "primitive" for locking access to "any particular thing", so that one can easily produce *correct* solutions. Everyone can use that. When you measure, or are aware of a particularly critical situation, have someone treat the particulars as a specific, critical situation.

HTH - feel free to continue/reiterate if it didn't.

Regards,

-cstb

> If you have gotten this far - thanks for your patience.
>
> Jimmy, Calgary, AB
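A minimal workspace-style sketch of the "iterate over a copy" advice in the reply above. The variable name (items) and the sample numbers are made up for illustration; copy, do: and remove: are the ordinary collection messages, but exact behaviour when mutating during a traversal varies by dialect and collection class, which is rather the point.

```smalltalk
| items |
items := OrderedCollection new.
items addAll: #(1 2 3 4 5 6).

"Removing from 'items' inside 'items do:' would change the collection
 underneath the traversal already under way, so it is avoided here."

"Traverse a snapshot instead, and mutate the original."
items copy do: [:each |
    each even ifTrue: [items remove: each]].

items   "now holds 1, 3 and 5"
```

When the goal is only to filter, sending reject: or select: and keeping the new collection it answers sidesteps the question entirely; the copy is needed only when the original object itself must be changed.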
In reply to this post by James J. Gavan-3
James,
> Forgive the intrusion, but here's a COBOLer asking for advice.

It's not an intrusion; you have raised an interesting topic.

> With an analyst background I tend to think of problems in design rather than programming terms. As an example, and frankly I doubt there is anything more complex, think of a stock exchange. The NYSE "owns" and maintains the database. They allow individual traders to "share" information from the DB via displayable collection(s). Conjecture on my part - but I would guess that while individual users can add/change/delete what they are viewing, their activity is passed on as a "request" to the NYSE application which after validation either accepts and updates the DB or raises an exception. It seems logical to me that it must function in this way. The "master-owner" of the DB must know and control all activities so that from the updated DB they can "refresh" collections displayed to end-users.

Perhaps an over-simplification, but I think you will need to wrap your shared collection in an object that is thread safe and can safely route data into and out of the collection on behalf of multiple users. Depending on the way you handle networking (e.g. via an ORB or by writing a TCP server), you might have multiple threads pounding on the wrapper, or the requests might be synchronized, queued, and handled one at a time by a single thread.

In your example above, there is a DBMS that handles much of the work, and iteration is typically via SQL queries, which are buffered/synchronized/etc. by the DBMS. I _think_ it is safe to say that the queries copy data, which is analogous to doing something like

    querySomething
        mutex critical: [
            ^(sharedCollection select: [:x | ... ]) copy]

so that each user iterates their own copy of a snapshot at the time they executed the query. Of course, one has to be concerned about the size of the result set. I end up providing a maximum number of matches to keep things manageable in most cases. As always, profiling is your friend, and all the more so when data has to traverse something as "slow" as a network.

There might end up being some relevant discussion on the Squeak mailing list - it's too soon to tell how far it will go, but there is someone asking questions about accessing a database via sockets in order to take better advantage of its multi-user capabilities than would be possible via blocking calls to ODBC/ADO.

> Exactly what would you do in Smalltalk to allow collection multi-sharing ?

Each connection has its own thread (and then some), so I do the best I can to make my systems thread safe, and look for things that run much slower than expected.

Have a good one,

Bill

--
Wilhelm K. Schwab, Ph.D.
[hidden email]
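One way the "snapshot with a maximum number of matches" idea from Bill's post might look, as a workspace-style sketch. Everything named here (shared, lock, maxMatches, the "greater than 5" test) is hypothetical, and Semaphore forMutualExclusion is used as a widely available stand-in for whatever Mutex class a given dialect provides.

```smalltalk
| shared lock maxMatches matches |
shared := OrderedCollection new.
shared addAll: #(3 8 15 4 23 42 16).
lock := Semaphore forMutualExclusion.
maxMatches := 3.
matches := OrderedCollection new.

"Build the bounded result while holding the lock..."
lock critical: [
    shared do: [:each |
        (each > 5 and: [matches size < maxMatches])
            ifTrue: [matches add: each]]].

"...then let the caller iterate its private snapshot without the lock."
matches   "holds 8, 15 and 23"
```

The lock is held only for as long as it takes to build the result, so a slow consumer (a network client, say) never blocks writers to the shared collection.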
In reply to this post by James J. Gavan-3
> Exactly what would you do in Smalltalk to allow collection multi-sharing?

You can read your Goldberg/Robson at page 261 and you will see an example of a SharedQueue. Doing it with a collection is similar.
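For readers without the book to hand, a small sketch of how a SharedQueue is typically used. The variable names and values are illustrative only; SharedQueue, nextPut:, next and fork are the standard Smalltalk-80 names, though scheduling details differ between dialects.

```smalltalk
| queue received |
queue := SharedQueue new.
received := OrderedCollection new.

"Producer: a separate process puts three requests on the queue."
[1 to: 3 do: [:i | queue nextPut: i * 10]] fork.

"Consumer: #next waits until an item is available, so the two
 processes never touch the queue's internals at the same time."
3 timesRepeat: [received add: queue next].

received   "holds 10, 20 and 30"
```

The synchronisation lives inside the queue, which is the pattern the post suggests copying when wrapping some other collection.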
In reply to this post by James J. Gavan-3
James J. Gavan wrote:
> Forgive the intrusion, but here's a COBOLer asking for advice. Perhaps you can direct me to other Smalltalk sites which might have papers covering the topic in depth. I'm not familiar with the language but do have a copy of Goldberg/Robson "Smalltalk 80".

No intrusion. However, you might get a wider range of opinions if you ask one level up, on comp.lang.smalltalk (although I sometimes wonder if there is /anybody/ who reads that group who doesn't at least lurk here...)

> Our Standards Committees (ANSI-J4 and ISO WG4) are currently trying to establish a standard, with the intent of portability code-wise between the compilers. They have pared down the number of collection type classes, which as a current user, I find acceptable - with one exception. My own compiler has a SortedCollection which has been totally ignored !

Then you have a useful sanity/completeness test for your standard -- can you implement your SortedCollection using only the stuff provided by the standard, and in a way that is essentially indistinguishable from the standard collections (from the user-programmer's POV) ? If so then you aren't losing /too/ much; if not then you have a great argument to use on the rest of the committee...

> The authors of the draft paper acknowledge that they researched other languages for ideas with specific mention of C, C++, C# and Java. I am more than annoyed that I found no reference to Smalltalk. I intend to include something in my comments like the following :-
>
> "Forget about C and Smalltalk for the moment. The Xerox PARC team had it figured whilst James Gosling (Java) was still going through school and university here in Calgary !".

"Forget about C and Java" I trust you mean ;-)

> The Committee have gone firm on the name Class Iterator - which is parallel to your Enumerator. You of course have methods "do", "select", "reject", "collect", "detect", "inject". From my own compiler I have the following methods to iterate through collections - "do", "select", "reject", "collect"; in addition "quitIteration" which I wouldn't mind betting has also been taken from Smalltalk :-)

Interestingly, to me anyway, 'inject' is rather widely seen as a poorly chosen name. There's an #inject:-bashing thread on C.L.S every now and then (I've indulged a bit myself), though I have to admit that no one has yet proposed an alternate name that is both snappy and reasonably self-explanatory...

BTW, Smalltalk doesn't really have external iterator objects separate from the collections like Java's java.util.Iterator/Enumeration. Technically, ReadStreams are external iterators, but they are not often used that way, AFAIK, because the internal iterator style is usually sufficient and much more convenient. (And, as a result, there is no standard 'quitIteration' either -- another recurring debate on C.L.S ;-)

> Now we come to the sticky point on Enumerating/Iterating. Firstly the draft text has just one invoke against the collection, which is the equivalent of "do" that you and I are using. My designs are based on desktop one-user applications. I confess it completely blew me out of the water to think of a collection being "multi-user" accessible just like a flat file or DB system.
[...]
> Exactly what would you do in Smalltalk to allow collection multi-sharing ?

It's not clear whether you are talking about the dangers of concurrent access to a collection (from more than one thread), which is not limited to problems with iterators, or that of "shared" access from the same thread, where code modifies a collection while it is iterating over it.

As far as concurrent access goes, I don't think that there is any one-size-fits-all solution. The problem is that what you really want to protect is the integrity of the data /in/ the collection, not just the consistency of any internal data-structures used /by/ the collection. For instance if you are using a collection to keep occurrence counts, then incrementing a count cannot be done safely without the co-operation of code external to the collection (the code that knows that incrementing should be an atomic action).

For that reason Smalltalk doesn't attempt to protect collections against concurrent access; it is up to the programmer to use a Mutex (or similar) to ensure semantic integrity is preserved. Of course, in Smalltalk (or any other OO) "raw" collections of data aren't normally exposed anyway. The actual Collection object (unless it's transient) will typically be part of the hidden implementation of some "proper" object that provides semantically meaningful, higher-level, operations (like incrementing a word-count), and so there's no particular problem with adding any needed synchronisation at this level.

But that's not the end of the story. Other objects with a <Collection>-style API (but which aren't formally instances of the class Collection or its subclasses) may require different ways of maintaining semantic integrity. E.g. a <Collection>-like object that represents data held in a shared SQL-like database cannot ensure integrity only through controls applied within one process; the data is held externally and so the Facade/wrapper must have an API extended with stuff corresponding to whatever mechanism the external facility uses for data-integrity, such as the mess of transactions, optimistic/pessimistic locks, and isolation levels used by SQL databases. So our object that (for most purposes) "looks like" a Collection will also have some way of manipulating SQL transactional semantics.

What I'm getting at is that just because it's useful to have a common style of access to Collection objects, #at:ifAbsent:, #do:, etc, doesn't and shouldn't (and probably can't) imply or require a common way of ensuring data-integrity.
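The occurrence-count example above can be sketched as follows, again workspace-style. The names (counts, lock, increment) are hypothetical, and Semaphore forMutualExclusion stands in for a dialect's Mutex; in a real system the block would be a method on the "proper" object that hides the Dictionary.

```smalltalk
| counts lock increment |
counts := Dictionary new.
lock := Semaphore forMutualExclusion.

"Read-modify-write must be atomic as a whole. The Dictionary cannot
 know that, so the code that owns the meaning supplies the lock."
increment := [:word |
    lock critical: [
        counts at: word put: (counts at: word ifAbsent: [0]) + 1]].

increment value: 'cobol'.
increment value: 'smalltalk'.
increment value: 'cobol'.

counts at: 'cobol'   "2"
```

Making at:ifAbsent: and at:put: individually thread-safe would not help here: two processes could still both read the old count before either writes the new one.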
I suspect, though, that the issue in the standard is really about (non-concurrent) modification of a collection while it is being iterated (otherwise I don't see why iteration is an important part of the picture at all). For instance Java2's Collection has the fail-fast feature which allows iterators to check for "illegal" modifications to the object; but those collections are not thread-safe, and, indeed, the fail-fast implementation itself is not thread-safe. In this case, the "Smalltalk Way" is just to expect the programmer to be sensible, and refrain from modifying a Collection while it's being iterated. Not the ideal solution, since it's quite easy to make mistakes. OTOH, the Java2 solution of including explicit checks in the collection/iterator doesn't really seem to buy much (it feels more like a debugging facility to me, rather than a legitimate part of a public API).

-- chris
Thanks to Bill, Chris, cstb, DiegoC for the responses so far. I'll absorb what I have so far, some of which appeals to me personally. Who was it referred me to the Smalltalk example ? Well ! Thank God I have a bright young COBOL buddy in Norfolk UK who did Smalltalk while at Exeter University - I'll challenge him to activate his memory cells and translate it into COBOL for me :-)

It's ironic - one COBOLer who took a cursory look at Java (and my bet is his depth of knowledge is minimal) scornfully posted a message with a reference to 'one-language Johnnies'. Well, surprised to find that the term FACTORY appears in other languages, I queried its usefulness in comp.object, specifically addressing my question to one person who appeared particularly knowledgeable. Reaction - largely stunned silence that an old warhorse like COBOL had OO; the particular person had rubbed shoulders with COBOL in his youth.

To try and illustrate my point, I included a small COBOL example - that blew it ! He gracefully declined - couldn't understand the example. To my surprise a Karl Kistler, head developer for an OO COBOL compiler being developed by Fujitsu-Siemens in Germany (not to be confused with Fujitsu), jumped in to try and illustrate the point in 'other terms' they might understand. Still no takers. That particular group, it seems to me, likes to specialize in the abstract, "Design patterns et al....". With their intimate knowledge of theory - it truly surprised me they were stumped by a bread-and-butter question from a doer - programmer !

So language-wise, 'Yer knows what yer knows' .

Jimmy, Calgary AB
James J. Gavan wrote:
> To try and illustrate my point, I included a small COBOL example - that blew it ! He gracefully declined - couldn't understand the example.
> [...]
> With their intimate knowledge of theory - it truly surprised me they were stumped by a bread-and-butter question from a doer - programmer !

I got interested in this and went looking for the thread. I must admit that I couldn't make sense of the examples either. It appears that the COBOL world's terminology and frames of reference are completely outwith my experience, so even the explanation of the example didn't enlighten me. I'm not trying to be impolite or put COBOL down in any way, I just wanted to say that the communication problem isn't limited to theory-oriented types :-(

> So language-wise, 'Yer knows what yer knows' .

<nods/>

-- chris