Hi All,
the VM now has support for read only objects, and the first logical application is for literals, making boxed floats, strings, symbols, arrays and byte arrays read-only. Doing so is trivial; only two methods need to be modified (see attached). AFAIA little or no core code is broken by making literals read-only. However, when I ran the test suite with read-only literals in place, several tests were broken. These are things like creating null-terminated strings for testing OSProcess, by replacing spaces in string literals with zeros, etc. When we added read-only literals to VisualWorks in something like vw7.0 the balance of opinion was that we should not break user code. Hence we implemented a preference scheme with three states: - By default, an attempt to modify a read-only literal would silently make the literal mutable temporarily,update it and then reenable read-onlyness. - A second state would issue a warning to the transcript, while doing what the default did. - The third state would raise an error Remember that all one needs to do to fix code that modifies literals is to send copy to the literal before modifying the copy, since copies of read-only literals are mutable. I was on the side of only raising an error, but was overruled :-). I wonder what this community thinks. If there are strong views that user code should continue to function unchanged (even though it is modifying literals, which is so so wrong :-) ) I'm happy to implement a scheme similar to that for VisualWorks. But I'd rather not have to and simply have people fix their code within a system that raises an error whenever an attempt is made to modify a literal, and is as simple as possible, but no simpler ;-) Thoughts, opinions? _,,,^..^,,,_ best, Eliot Compiler-readOnlyLiterals.st (2K) Download Attachment |
Hi Eliot,
I have not given this much thought, but just to get the discussion going I'll answer from the POV of maintaining a package like OSProcess. As the maintainer, I would be happier to just fix my code and be done with it, as opposed to trying to answer questions from various people who might or might not have have yet another obscure preference set. But the counter argument is that not all of the useful external packages have active maintainers, and we don't want to set up a situation where those packages cannot be used. I don't know if that would be the case here, but if so we would be well advised to provide a preference to keep those packages useable. If we do add a preference, then I would rather that the default be to simply raise an error (your preferred state of affairs), and the preference would exist only in the interest of allowing existing (unmaintained?) packages to be loaded and run. Off topic for your question, but just for my own understanding - why is it good for a ByteArray to be read-only? I would have thought that it would be just an array of things that happen to be very small positive integers (sorry I am sure this has been discussed before). Dave On Sun, Mar 25, 2018 at 04:18:32PM -0700, Eliot Miranda wrote: > Hi All, > > the VM now has support for read only objects, and the first logical > application is for literals, making boxed floats, strings, symbols, arrays > and byte arrays read-only. Doing so is trivial; only two methods need to > be modified (see attached). AFAIA little or no core code is broken by > making literals read-only. However, when I ran the test suite with > read-only literals in place, several tests were broken. These are things > like creating null-terminated strings for testing OSProcess, by replacing > spaces in string literals with zeros, etc. > > When we added read-only literals to VisualWorks in something like vw7.0 the > balance of opinion was that we should not break user code. Hence we > implemented a preference scheme with three states: > > - By default, an attempt to modify a read-only literal would silently make > the literal mutable temporarily,update it and then reenable read-onlyness. > - A second state would issue a warning to the transcript, while doing what > the default did. > - The third state would raise an error > > Remember that all one needs to do to fix code that modifies literals is to > send copy to the literal before modifying the copy, since copies of > read-only literals are mutable. > > I was on the side of only raising an error, but was overruled :-). I > wonder what this community thinks. If there are strong views that user > code should continue to function unchanged (even though it is modifying > literals, which is so so wrong :-) ) I'm happy to implement a scheme > similar to that for VisualWorks. But I'd rather not have to and simply > have people fix their code within a system that raises an error whenever an > attempt is made to modify a literal, and is as simple as possible, but no > simpler ;-) Thoughts, opinions? > > > _,,,^..^,,,_ > best, Eliot > |
Hi All,
I agree with David and think it should raise an error. If you make it a setting you'll be dealing with the consequences of misuse/misunderstanding in perpetuity. You could instead make the error message that point to a wiki page (or help page) that lists all the ways to fix the 'modifying a literal' error and why its useful and also some RB transformation rules there a person could run against their package/image to preemptively fix them. When there aren't active maintainers for packages there could be a section on the wiki describing how to get a squeaksource admin to upload a user created fixed package. Then unused unmaintained packages aren't holding you back and used ones get fixed. Hope this helps Paul David T. Lewis wrote > Hi Eliot, > > I have not given this much thought, but just to get the discussion going > I'll answer from the POV of maintaining a package like OSProcess. As the > maintainer, I would be happier to just fix my code and be done with it, > as opposed to trying to answer questions from various people who might > or might not have have yet another obscure preference set. > > But the counter argument is that not all of the useful external packages > have active maintainers, and we don't want to set up a situation where > those packages cannot be used. I don't know if that would be the case > here, but if so we would be well advised to provide a preference to keep > those packages useable. > > If we do add a preference, then I would rather that the default be to > simply raise an error (your preferred state of affairs), and the > preference > would exist only in the interest of allowing existing (unmaintained?) > packages to be loaded and run. > > Off topic for your question, but just for my own understanding - why is > it good for a ByteArray to be read-only? I would have thought that it > would be just an array of things that happen to be very small positive > integers (sorry I am sure this has been discussed before). > > Dave > > > On Sun, Mar 25, 2018 at 04:18:32PM -0700, Eliot Miranda wrote: >> Hi All, >> >> the VM now has support for read only objects, and the first logical >> application is for literals, making boxed floats, strings, symbols, >> arrays >> and byte arrays read-only. Doing so is trivial; only two methods need to >> be modified (see attached). AFAIA little or no core code is broken by >> making literals read-only. However, when I ran the test suite with >> read-only literals in place, several tests were broken. These are things >> like creating null-terminated strings for testing OSProcess, by replacing >> spaces in string literals with zeros, etc. >> >> When we added read-only literals to VisualWorks in something like vw7.0 >> the >> balance of opinion was that we should not break user code. Hence we >> implemented a preference scheme with three states: >> >> - By default, an attempt to modify a read-only literal would silently >> make >> the literal mutable temporarily,update it and then reenable >> read-onlyness. >> - A second state would issue a warning to the transcript, while doing >> what >> the default did. >> - The third state would raise an error >> >> Remember that all one needs to do to fix code that modifies literals is >> to >> send copy to the literal before modifying the copy, since copies of >> read-only literals are mutable. >> >> I was on the side of only raising an error, but was overruled :-). I >> wonder what this community thinks. If there are strong views that user >> code should continue to function unchanged (even though it is modifying >> literals, which is so so wrong :-) ) I'm happy to implement a scheme >> similar to that for VisualWorks. But I'd rather not have to and simply >> have people fix their code within a system that raises an error whenever >> an >> attempt is made to modify a literal, and is as simple as possible, but no >> simpler ;-) Thoughts, opinions? >> >> >> _,,,^..^,,,_ >> best, Eliot > > >> -- Sent from: http://forum.world.st/Squeak-Dev-f45488.html |
In reply to this post by David T. Lewis
Hi David,
On Sun, Mar 25, 2018 at 6:21 PM, David T. Lewis <[hidden email]> wrote: Hi Eliot, Good :-) But the counter argument is that not all of the useful external packages But since it's trivial to fix these issues I'm not worried by this.
This example applies to strings, arrays and byte arrays: Object>>modifyLiteral | stream | stream := ReadWriteStream on: #[1 2 3 4] from: 1 to: 4. {stream contents. stream reset; nextPut: 255; setToEnd; contents} In the current system, the first time this is run it answers #(#[1 2 3 4] #[255 2 3 4]) and thereafter (or until it is recompiled) it answers #(#[255 2 3 4] #[255 2 3 4]). Ugh :-). With read-only literals it raises an error; the at:put: in nextPut: fails and raises ModificationForbidden. Note that such modification of literals is invisible unless one decompiles the method. The source of the method remains unchanged, cloaking the bug. Note that changing the example to Object>>modifyCopyOfLiteral | stream | stream := ReadWriteStream on: #[1 2 3 4] copy from: 1 to: 4. {stream contents. stream reset; nextPut: 255; setToEnd; contents} causes the method to behave unsurprisingly. Hence my claim that fixing these issues is trivial (deepCopy can be used to produce a mutable copy of a literal array of literals, for example). Dave _,,,^..^,,,_ best, Eliot |
In reply to this post by Eliot Miranda-2
On Sun, Mar 25, 2018 at 6:18 PM, Eliot Miranda <[hidden email]> wrote:
> Hi All, Hi Eliot, Congratulations on the enhancement. As developer of Magma, I'm interested in learning more about it as a possible alternative to the WriteBarrier. How does it work? First, it might help me to understand what _your_ primary motivation for proposing to make various entire classes of objects read-only by default? As you found from running the tests, the few cases that modified literals were doing it intentionally. My guess is that more read-only objects allows the VM to run a lot faster, is that right? But my hope was that this capability would be provided with granularity at the object-level, not necessarily the class level. > the VM now has support for read only objects, and the first logical > application is for literals, making boxed floats, strings, symbols, arrays > and byte arrays read-only. Wow, Arrays and ByteArrays too? Magma's core Buffer classes all use ByteArray's internally to represent and update their state. Ouch! > Doing so is trivial; only two methods need to be > modified (see attached). AFAIA little or no core code is broken by making > literals read-only. However, when I ran the test suite with read-only > literals in place, several tests were broken. These are things like > creating null-terminated strings for testing OSProcess, by replacing spaces > in string literals with zeros, etc. > > When we added read-only literals to VisualWorks in something like vw7.0 the > balance of opinion was that we should not break user code. Hence we > implemented a preference scheme with three states: > > - By default, an attempt to modify a read-only literal would silently make > the literal mutable temporarily,update it and then reenable read-onlyness. Magma uses previously-loaded ByteArray's and replaces their contents, field by field, via ByteArray>>#uint:at:put:. It runs this method more than any other in the image -- millions and millions of times. The performance of this method is very important for Magma. > - A second state would issue a warning to the transcript, while doing what > the default did. I'm reading that and imagining a low level primitive like Array>>#at:put: having a reference or signal to the Transcript that runs *every single time*? That can't be true. Surely the "preference scheme" you mentioned must've been more sophisticated than global Preference values. Something more dynamic and simple, like "myObject beReadOnly: true"...? Then applications -- including the IDE -- can handle AttemptToModifyReadOnlyObject as needed. > - The third state would raise an error In the process of raising that error, is there any danger of encountering a place in the debugger code which attempted to modify another Array, thus causing it to re-enter the original error-raising code to raise yet another error? It seems like a pretty low-level thing to have to guard against... > Remember that all one needs to do to fix code that modifies literals is to > send copy to the literal before modifying the copy, since copies of > read-only literals are mutable. I hope we are not confusing the intermal representation of objects being bytes with whether those objects are literals or not. If a program sees fit to modify a "literal", techincally is not a literal, but a variable, right? Dave's case sounds like a true literal, but I'm saying I have applications which update bitmap forms in real time. Artists paint on SketchMorphs.. I presume you don't mean for _these_ cases to need to make constant copies... > I was on the side of only raising an error, but was overruled :-). I wonder > what this community thinks. If there are strong views that user code should > continue to function unchanged (even though it is modifying literals, which > is so so wrong :-) ) I'm happy to implement a scheme similar to that for > VisualWorks. But I'd rather not have to and simply have people fix their > code within a system that raises an error whenever an attempt is made to > modify a literal, and is as simple as possible, but no simpler ;-) > Thoughts, opinions? Instead of introducing new assumptions and breaking old code, I suggest people writing which explicitly turn ON read-only'ness, for the objects they want preserving legacy expectations about Smalltalk as well as the operation of legacy applications. IMO, any introduction of immutability at the class-level should be turned off by default. Make it powerful but not painful. It would help tremendously to know what you want to accomplish and how the new API works. Regards, Chris |
In reply to this post by Eliot Miranda-2
> the VM now has support for read only objects, and the first logical
> application is for literals, making boxed floats, strings, symbols, > arrays and byte arrays read-only. Arrays and byte arrays? Do you mean we will not be able anymore to update the content of any array without copying it? It would be very, very wrong IMO. Avoiding unnecessary copying is of paramount importance for performance. BTW, sound is stored in byte arrays. I guess I am missing something here :) Stef |
In reply to this post by Chris Muller-3
> As you found from running the tests, the few cases that
> modified literals were doing it intentionally. My guess is that more > read-only objects allows the VM to run a lot faster, is that right? > But my hope was that this capability would be provided with > granularity at the object-level, not necessarily the class level. +1 > Wow, Arrays and ByteArrays too? Magma's core Buffer classes all use > ByteArray's internally to represent and update their state. Ouch! +1000 > Magma uses previously-loaded ByteArray's and replaces their contents, > field by field, via ByteArray>>#uint:at:put:. It runs this method > more than any other in the image -- millions and millions of times. > The performance of this method is very important for Magma. This definitely has to be kept possible. >> - A second state would issue a warning to the transcript, while doing what >> the default did. > > I'm reading that and imagining a low level primitive like > Array>>#at:put: having a reference or signal to the Transcript that > runs *every single time*? That can't be true. +100 > In the process of raising that error, is there any danger of > encountering a place in the debugger code which attempted to modify > another Array, thus causing it to re-enter the original error-raising > code to raise yet another error? It seems like a pretty low-level > thing to have to guard against... +1 > I hope we are not confusing the intermal representation of objects > being bytes with whether those objects are literals or not. If a > program sees fit to modify a "literal", techincally is not a literal, > but a variable, right? Dave's case sounds like a true literal, but > I'm saying I have applications which update bitmap forms in real time. > Artists paint on SketchMorphs.. I presume you don't mean for _these_ > cases to need to make constant copies... +1000 > Instead of introducing new assumptions and breaking old code, I > suggest people writing which explicitly turn ON read-only'ness, for > the objects they want preserving legacy expectations about Smalltalk > as well as the operation of legacy applications. IMO, any > introduction of immutability at the class-level should be turned off > by default. Make it powerful but not painful. +1 > It would help tremendously to know what you want to accomplish and how > the new API works. Indeed. Stef |
In reply to this post by Stéphane Rollandin
> I guess I am missing something here :)
Ah; I guess what I have been missing is how it works in VisualWorks: https://stackoverflow.com/questions/18994704/visualworks-cincom-smalltalk-array-initialization-differences Is that it? Stef |
In reply to this post by Chris Muller-3
On 26 March 2018 at 11:26, Chris Muller <[hidden email]> wrote: On Sun, Mar 25, 2018 at 6:18 PM, Eliot Miranda <[hidden email]> wrote: But my hope was that this capability would be provided with I would presume an entire class is not made read-only by default, just particular objects formed from literals at compile time. > the VM now has support for read only objects, and the first logical cheers -ben |
In reply to this post by Stéphane Rollandin
Since the Compiler will decide whether literals are to be made readOnly or not, it's there that the preference should act, if we ever need a preference. It happened to work, but it's not a supported feature and there never were any guaranty that such behavior would be preserved. For example, a Compiler might as well avoid duplicating literals and share them across CompiledMethod. An application relying massively on such unsupported feature could use own Compiler. Or we could introduce kind of compiler directives in annotations like Pharo did. But I wonder if the complication is worth. Aren't there many other ways to hold states? 2018-03-26 10:27 GMT+02:00 Stéphane Rollandin <[hidden email]>: As you found from running the tests, the few cases that |
No problem, it appears I did not understand what a literal is.
Sorry for the noise, Stef |
In reply to this post by David T. Lewis
On Sun, Mar 25, 2018 at 09:21:56PM -0400, David T. Lewis wrote:
> > Off topic for your question, but just for my own understanding - why is > it good for a ByteArray to be read-only? I would have thought that it > would be just an array of things that happen to be very small positive > integers (sorry I am sure this has been discussed before). > Ah, ok. I read the follow up discussion of St??phane Rollandin, I get it now :-) Dave |
In reply to this post by Eliot Miranda-2
On Sun, Mar 25, 2018 at 07:33:40PM -0700, Eliot Miranda wrote:
> Hi David, > > On Sun, Mar 25, 2018 at 6:21 PM, David T. Lewis <[hidden email]> wrote: > > > > > Off topic for your question, but just for my own understanding - why is > > it good for a ByteArray to be read-only? I would have thought that it > > would be just an array of things that happen to be very small positive > > integers (sorry I am sure this has been discussed before). > > > > This example applies to strings, arrays and byte arrays: > > Object>>modifyLiteral > | stream | > stream := ReadWriteStream on: #[1 2 3 4] from: 1 to: 4. > {stream contents. stream reset; nextPut: 255; setToEnd; contents} > > In the current system, the first time this is run it answers #(#[1 2 3 4] > #[255 2 3 4]) and thereafter (or until it is recompiled) it answers #(#[255 > 2 3 4] #[255 2 3 4]). Ugh :-). With read-only literals it raises an > error; the at:put: in nextPut: fails and raises ModificationForbidden. > Note that such modification of literals is invisible unless one decompiles > the method. The source of the method remains unchanged, cloaking the bug. > > Note that changing the example to > > Object>>modifyCopyOfLiteral > | stream | > stream := ReadWriteStream on: #[1 2 3 4] copy from: 1 to: 4. > {stream contents. stream reset; nextPut: 255; setToEnd; contents} > > causes the method to behave unsurprisingly. Hence my claim that fixing > these issues is trivial (deepCopy can be used to produce a mutable copy of > a literal array of literals, for example). > Thanks, I understand now. I am fairly sure that I've been bitten by this before, so I am happy to see it become less "surprising". Dave |
In reply to this post by Eliot Miranda-2
+1 for read-only literals in general. Trying to modify a literal should raise a Deprecation Warning for at least one release cycle, with an option to silence it. In some future release we can make it an error. - Bert - |
> +1 for read-only literals in general.
> > Trying to modify a literal should raise a Deprecation Warning for at > least one release cycle, with an option to silence it. > > In some future release we can make it an error. > > - Bert - +1 Stef |
Le lun. 26 mars 2018 à 14:46, Stéphane Rollandin <[hidden email]> a écrit : > +1 for read-only literals in general. +1 too, it sounds very reasonable. |
In reply to this post by Chris Muller-3
Hi Chris,
- an object-to-database mapping system -debugging etc
|
In reply to this post by Stéphane Rollandin
On Mar 26, 2018, at 1:33 AM, Stéphane Rollandin <[hidden email]> wrote: >> I guess I am missing something here :) > > Ah; I guess what I have been missing is how it works in VisualWorks: > https://stackoverflow.com/questions/18994704/visualworks-cincom-smalltalk-array-initialization-differences > > Is that it? Yes. Although our setter/getter and exceptions are different. But the base mechanism is effectively identical. There is a per-object read-only bit that prevents modification that is off by default and will be set for literals. > > Stef > |
In reply to this post by Eliot Miranda-2
(Palm hits forehead "duh, of course!"). You meant only literal Arrays
and ByteArrays in CompiledMethods, not _all_ Arrays and ByteArray instances. Whew, sorry! :) Thanks for your patience with me. On Mon, Mar 26, 2018 at 9:09 AM, Eliot Miranda <[hidden email]> wrote: > Hi Chris, > > > > _,,,^..^,,,_ (phone) > On Mar 25, 2018, at 8:26 PM, Chris Muller <[hidden email]> wrote: > > On Sun, Mar 25, 2018 at 6:18 PM, Eliot Miranda <[hidden email]> > wrote: > > Hi All, > > > Hi Eliot, > > Congratulations on the enhancement. As developer of Magma, I'm > interested in learning more about it as a possible alternative to the > WriteBarrier. How does it work? > > > There is a per-object flag (a bit in the object header) that if set prevents > modification. The bit is controlled by Object>>#setIsReadOnlyObject: with > Object>>#beReadOnlyObject as a convenience. Object>>#beReadOnlyObject is > the getter. By default the bit is off (new objects are mutable). The > Compiler modification sets the bit for method literals (literal strings, > symbols, arrays, byte arrays and boxed floats). The facility may also be > used for > - an object-to-database mapping system > -debugging > etc > > > First, it might help me to understand what _your_ primary motivation > for proposing to make various entire classes of objects read-only by > default? > > > I have proposed no such thing. You have misunderstood :-) > > As you found from running the tests, the few cases that > modified literals were doing it intentionally. My guess is that more > read-only objects allows the VM to run a lot faster, is that right? > But my hope was that this capability would be provided with > granularity at the object-level, not necessarily the class level. > > the VM now has support for read only objects, and the first logical > > application is for literals, making boxed floats, strings, symbols, arrays > > and byte arrays read-only. > > > Wow, Arrays and ByteArrays too? Magma's core Buffer classes all use > ByteArray's internally to represent and update their state. Ouch! > > > No, no, no, no :-). Only _literal_ instances of these. > > > Doing so is trivial; only two methods need to be > > modified (see attached). AFAIA little or no core code is broken by making > > literals read-only. However, when I ran the test suite with read-only > > literals in place, several tests were broken. These are things like > > creating null-terminated strings for testing OSProcess, by replacing spaces > > in string literals with zeros, etc. > > > When we added read-only literals to VisualWorks in something like vw7.0 the > > balance of opinion was that we should not break user code. Hence we > > implemented a preference scheme with three states: > > > - By default, an attempt to modify a read-only literal would silently make > > the literal mutable temporarily,update it and then reenable read-onlyness. > > > Magma uses previously-loaded ByteArray's and replaces their contents, > field by field, via ByteArray>>#uint:at:put:. It runs this method > more than any other in the image -- millions and millions of times. > The performance of this method is very important for Magma. > > > And why on earth would I try and introduce something that would break this? > > > - A second state would issue a warning to the transcript, while doing what > > the default did. > > > I'm reading that and imagining a low level primitive like > Array>>#at:put: having a reference or signal to the Transcript that > runs *every single time*? That can't be true. Surely the "preference > scheme" you mentioned must've been more sophisticated than global > Preference values. Something more dynamic and simple, like "myObject > beReadOnly: true"...? Then applications -- including the IDE -- can > handle AttemptToModifyReadOnlyObject as needed. > > - The third state would raise an error > > > In the process of raising that error, is there any danger of > encountering a place in the debugger code which attempted to modify > another Array, thus causing it to re-enter the original error-raising > code to raise yet another error? It seems like a pretty low-level > thing to have to guard against... > > Remember that all one needs to do to fix code that modifies literals is to > > send copy to the literal before modifying the copy, since copies of > > read-only literals are mutable. > > > I hope we are not confusing the intermal representation of objects > being bytes with whether those objects are literals or not. If a > program sees fit to modify a "literal", techincally is not a literal, > but a variable, right? Dave's case sounds like a true literal, but > I'm saying I have applications which update bitmap forms in real time. > Artists paint on SketchMorphs.. I presume you don't mean for _these_ > cases to need to make constant copies... > > I was on the side of only raising an error, but was overruled :-). I wonder > > what this community thinks. If there are strong views that user code should > > continue to function unchanged (even though it is modifying literals, which > > is so so wrong :-) ) I'm happy to implement a scheme similar to that for > > VisualWorks. But I'd rather not have to and simply have people fix their > > code within a system that raises an error whenever an attempt is made to > > modify a literal, and is as simple as possible, but no simpler ;-) > > Thoughts, opinions? > > > Instead of introducing new assumptions and breaking old code, I > suggest people writing which explicitly turn ON read-only'ness, for > the objects they want preserving legacy expectations about Smalltalk > as well as the operation of legacy applications. IMO, any > introduction of immutability at the class-level should be turned off > by default. Make it powerful but not painful. > > It would help tremendously to know what you want to accomplish and how > the new API works. > > > We have been discussing this for a while now. Clément even wrote a paper > > https://hal.archives-ouvertes.fr/hal-01356338/document > > > Regards, > Chris > |
Free forum by Nabble | Edit this page |