Question about inlining | How to access named temps in FullBlockClosure?

Re: CurrentReadOnlySourceFiles (was: Re: Question about inlining | How to access named temps in FullBlockClosure?)

Levente Uzonyi
Hi Eliot,

On Tue, 31 Mar 2020, Eliot Miranda wrote:

> Hi Levente,
>
> On Tue, Mar 31, 2020 at 3:33 PM Levente Uzonyi <[hidden email]> wrote:
>       Hi Eliot,
>
>       On Mon, 30 Mar 2020, Eliot Miranda wrote:
>
>       > Hi Levente,
>       >
>       >> On Mar 30, 2020, at 2:21 PM, Levente Uzonyi <[hidden email]> wrote:
>       >>
>       >> Hi Eliot,
>       >>
>       >>> On Mon, 30 Mar 2020, Eliot Miranda wrote:
>       >>>
>       >>> Well, that's not what I meant by a search.  However, as Levente pointed out, textual searches should be surrounded with CurrentReadOnlySourceFiles cacheDuring:.  I think this is an awful implementation and would implement it
>       >>> very differently but that's the work-around we have in place now,
>       >>
>       >> How would you implement it?
>       >>
>       >> <history>
>       >> When I introduced CurrentReadOnlySourceFiles, I wanted to solve the issue of concurrent access to the source files.
>       >> I had the following options:
>       >> 1) controlled access to a shared resource (a single read-only copy of the SourceFiles array) with e.g. a Monitor
>       >> 2) same as 1) but with multiple copies pooled
>       >> 3) exceptions to define the scope and lifetime of the resources (the read-only copies) within a process
>       >>
>       >> I chose the third option because it was possible to introduce it without exploring and rewriting existing users: you could leave all code as-is
>       >> and sprinkle CurrentReadOnlySourceFiles cacheDuring: [ ... ] around code that needed better performance.
>       >> It's obviously not a perfect solution, but I still think it was the best available at the time.
>       >>
>       >> Later ProcessLocalVariables were added to Trunk, which could be used to solve the concurrent access issue by using process-local copies of the source files. The only challenge is to release them after they are
>       >> not needed any more. Perhaps a high priority process could do that after a few minutes of inactivity. Or we could just let them linger and see if they cause any problems.
>       >> </history>
>       >
>       > I think the key issue (& this from a discussion here with Bert) is access to source in the debugger while one is debugging file access.  As the debugger asks for source, the file pointer is moved, which
>       > screws up the access one is trying to debug.
>
>       I don't think that's the only issue. Have a look at the senders of
>       #readOnlyCopy. Many of them were added 10+ years ago, well before
>       CurrentReadOnlySourceFiles was introduced. Most of those could use
>       CurrentReadOnlySourceFiles too but are unrelated to the debugger.
>
>
> Yes, but IIRC that issue was to separate the writable file from the read-only file.  I remember dealing with this when working on Newspeak in 2007/2008. So SourceFiles can easily maintain a writable file and a read-only copy
> of the file for both sources and changes and do writes through the writable one.
>
>
>       >
>       > So I would provide something like
>       >   SourceFiles withSubstituteCopiesWhile: aBlock
>       > which would install either copies of the files or read-only copies of the files for the duration of the block, and have the debugger use the method around its access to method source.
>       >
>       > The IDE is essentially single threaded as far as code modification goes, even if this isn’t enforced. There is no serialization on adding/removing methods and concurrent access can corrupt method dictionaries,
>       > and that limitation is fine in practice.  So having the same restriction on source file access is fine too (and in fact I think the restriction already applies; if one were to fork compiles then source update to
>       > the changes file could get corrupted too).
>       >
>       > So I think not using read-only copies to read source, and having the debugger use copies for its access would be a good lightweight solution.
>
>       I agree with what you wrote about method changes, but reading the sources
>       concurrently is still a possibility, especially when multiple UI processes
>       can exist at the same time (e.g. that's what opening a debugger does,
>       right?).
>
>
> My assertion is that the IDE is essentially single-threaded and this doesn't have to be supported.  In any case, concurrent access will work if processes of the same priority level are cooperating.  But I just answered the
> debugger issue.  I'm assuming that the debugger guards all its source access by substituting a different file.  So it, and only it, accesses the sources files through copies, and it, and only it pays the cost for substituting
> the copies.  Normal queries can use a single read-only copy.  That gives us the functionality of cacheDuring: without having to invoke it.
The IDE is single-threaded but source files may be read outside the
context of the IDE.

>
>
> So let me reiterate.
>
> SourceFiles is modified to have a single writable version of the changes file and a single read-only version of sources and changes files.  Source code is read through the readable copy and new source written through the
> writable copy.  Whenever the debugger accesses source it does so through a method that first saves the files, substitutes copies in SourceFiles, evaluates its block (that will access source through the copies), and then
> ensures that the original files are restored.  There can be error checking for writing to the changes file in the debugger while writes are in progress to the original writable changes file, although I'm not sure this is
> necessary; folks debugging source file access usually know what they're doing.
>
> The result is that
> - normal source reading does not require creating a read-only copy; it already exists.
Do you mean that the existence of #readOnlyCopy is satisfactory?
Creating a copy for every single file access is painfully slow.
CurrentReadOnlySourceFiles only exists to remedy that by reusing the same
read-only copy.

> - the debugger does not interfere with source access because it is careful to use copies and leave the originals undisturbed

That's exactly what I tried to imply by stating that there was no problem
with the debugger before the mass use of read-only copies was introduced.


Levente

> - CurrentReadOnlySourceFiles and cacheDuring: can be discarded


Re: CurrentReadOnlySourceFiles (was: Re: Question about inlining | How to access named temps in FullBlockClosure?)

Eliot Miranda-2
Hi Levente,

On Tue, Mar 31, 2020 at 7:38 PM Levente Uzonyi <[hidden email]> wrote:
Hi Eliot,

On Tue, 31 Mar 2020, Eliot Miranda wrote:

> My assertion is that the IDE is essentially single-threaded and this doesn't have to be supported.  In any case, concurrent access will work if processes of the same priority level are cooperating.  But I just answered the
> debugger issue.  I'm assuming that the debugger guards all its source access by substituting a different file.  So it, and only it, accesses the sources files through copies, and it, and only it pays the cost for substituting
> the copies.  Normal queries can use a single read-only copy.  That gives us the functionality of cacheDuring: without having to invoke it.

The IDE is single-threaded but source files may be read outside the
context of the IDE.

Can you give me a for instance?  I simply don't believe you.  And even if it's true I don't see that it has to be supported.  Please don't be vague.  This is important.
 
>
>
> So let me reiterate.
>
> SourceFiles is modified to have a single writable version of the changes file and a single read-only version of sources and changes files.  Source code is read through the readable copy and new source written through the
> writable copy.  Whenever the debugger accesses source it does so through a method that first saves the files, substitutes copies in SourceFiles, evaluates its block (that will access source through the copies), and then
> ensures that the original files are restored.  There can be error checking for writing to the changes file in the debugger while writes are in progress to the original writable changes file, although I'm not sure this is
> necessary; folks debugging source file access usually know what they're doing.
>
> The result is that
> - normal source reading does not require creating a read-only copy; it already exists.

Do you mean that the existence of #readOnlyCopy is satisfactory?

Yes.
 
Creating a copy for every single file access is painfully slow.

Exactly.
 
CurrentReadOnlySourceFiles only exists to remedy that by reusing the same
read-only copy.

I feel like you're not understanding my proposal.  Apologies if I'm presuming.  With my proposal the only time a new copy is created is when the debugger wants to access the source of a method.  That happens on the order of seconds, not microseconds as happens when scanning for source.


> - the debugger does not interfere with source access because it is careful to use copies and leave the originals undisturbed

That's exactly what I tried to imply by stating that there was no problem
with the debugger before the mass use of read-only copies was introduced.

Can you not see that in any scheme there is the potential for chaos if the debugger is accessing the source as one steps through code of methods that themselves are in the process of accessing source?  And so it is key that the debugger *not* perturb the system when it itself accesses source?

I'm confused.  We seem to be talking past each other.  I feel like you're blocking a reasonable proposal but I don't really understand what your objections are.  I apologize.  I'm not trying to be confrontational, but I do think my proposal is important and has merit and I feel frustrated by you because I can't quite understand why you're against it.  If you can identify a serious flaw I'll happily abandon it.  But I need to understand the flaw first.



Levente



--
_,,,^..^,,,_
best, Eliot
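
The file-pointer problem and the substitute-copies idea discussed in this message can be modelled in a few lines. The following is only a sketch under stated assumptions: it is written in Python rather than Smalltalk so that it runs outside an image, it uses in-memory streams instead of real source files, and `with_substitute_copies`, `debugger_fetch_source` and `read_chunk` are hypothetical names standing in for the proposed `SourceFiles withSubstituteCopiesWhile: aBlock`, not existing Squeak API.

```python
import io
from contextlib import contextmanager

# Stand-in for the global SourceFiles array: index 0 = sources, 1 = changes.
# In-memory streams keep the sketch self-contained; a real image holds file streams.
SOURCE_TEXT = b"method one!method two!method three!"
source_files = [io.BytesIO(SOURCE_TEXT), io.BytesIO(b"")]


def read_chunk(stream):
    """Read from the current position up to (not including) the next '!'."""
    out = bytearray()
    while (byte := stream.read(1)) and byte != b"!":
        out += byte
    return bytes(out)


@contextmanager
def with_substitute_copies():
    """Hypothetical analogue of `SourceFiles withSubstituteCopiesWhile: aBlock`.

    Installs independent copies for the duration of the block and always
    restores the originals, so the originals' positions are never moved.
    """
    originals = list(source_files)
    try:
        source_files[:] = [io.BytesIO(f.getvalue()) for f in originals]
        yield
    finally:
        source_files[:] = originals


def debugger_fetch_source(offset):
    """What a debugger does to display a method: seek, then read."""
    stream = source_files[0]
    stream.seek(offset)
    return read_chunk(stream)


# The code being debugged has consumed the first chunk and expects the second next.
first = read_chunk(source_files[0])

# Unguarded debugger access moves the shared position to the end of the data...
debugger_fetch_source(22)
perturbed = read_chunk(source_files[0])  # the debugged code now reads nothing

# Put the position back where the debugged code left it, then guard the access.
source_files[0].seek(11)
with with_substitute_copies():
    shown = debugger_fetch_source(22)
undisturbed = read_chunk(source_files[0])  # the second chunk, as expected
```

In this model only the debugger pays for creating copies, and ordinary readers keep using the one long-lived stream, which is the trade-off the proposal argues for.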



Re: CurrentReadOnlySourceFiles (was: Re: Question about inlining | How to access named temps in FullBlockClosure?)

Levente Uzonyi
Hi Eliot,

On Tue, 31 Mar 2020, Eliot Miranda wrote:

>       > My assertion is that the IDE is essentially single-threaded and this doesn't have to be supported.  In any case, concurrent access will work if processes of the same priority level are cooperating.  But I just answered the
>       > debugger issue.  I'm assuming that the debugger guards all its source access by substituting a different file.  So it, and only it, accesses the sources files through copies, and it, and only it pays the cost for substituting
>       > the copies.  Normal queries can use a single read-only copy.  That gives us the functionality of cacheDuring: without having to invoke it.
>
>       The IDE is single-threaded but source files may be read outside the
>       context of the IDE.
>
>
> Can you give me a for instance?  I simply don't believe you.  And even if it's true I don't see that it has to be supported.  Please don't be vague.  This is important.
For example, Seaside has a web-based code browser. The webserver, no
matter which one is used by Seaside, will read the code from a process
different than the UI process.

>  
>       >
>       >
>       > So let me reiterate.
>       >
>       > SourceFiles is modified to have a single writable version of the changes file and a single read-only version of sources and changes files.  Source code is read through the readable copy and new source written through the
>       > writable copy.  Whenever the debugger accesses source it does so through a method that first saves the files, substitutes copies in SourceFiles, evaluates its block (that will access source through the copies), and then
>       > ensures that the original files are restored.  There can be error checking for writing to the changes file in the debugger while writes are in progress to the original writable changes file, although I'm not sure this is
>       > necessary; folks debugging source file access usually know what they're doing.
>       >
>       > The result is that
>       > - normal source reading does not require creating a read-only copy; it already exists.
>
>       Do you mean that the existence of #readOnlyCopy is satisfactory?
>
>
> Yes.
>  
>       Creating a copy for every single file access is painfully slow.
>
>
> Exactly.
>  
>       CurrentReadOnlySourceFiles only exists to remedy that by reusing the same
>       read-only copy.
>
>
> I feel like you're not understanding my proposal.  Apologies if I'm presuming.  With my proposal the only time a new copy is created is when the debugger wants to access the source of a method.  That happens on the order of
> seconds, not microseconds as happens when scanning for source.
You say that each tool should use the same shared file streams. If so,
that implies that all sends of #readOnlyCopy have unnecessarily been added
over the years except for those which are related to the debugger.
And if I were to remove them along with CurrentReadOnlySourceFiles,
everything would stay normal, right?

>
>
>       > - the debugger does not interfere with source access because it is careful to use copies and leave the originals undisturbed
>
>       That's exactly what I tried to imply by stating that there was no problem
>       with the debugger before the mass use of read-only copies was introduced.
>
>
> Can you not see that in any scheme there is the potential for chaos if the debugger is accessing the source as one steps through code of methods that themselves are in the process of accessing source?  And so it is key that
> the debugger *not* perturb the system when it itself accesses source?
>
> I'm confused.  We seem to be talking past each other.  I feel like you're blocking a reasonable proposal but I don't really understand what your objections are.  I apologize.  I'm not trying to be confrontational, but I do
> think my proposal is important and has merit and I feel frustrated by you because I can't quite understand why you're against it.  If you can identify a serious flaw I'll happily abandon it.  But I need to understand the flaw
> first.
I think that we have different ideas about what the problem is:

1) You say that the debugger is a key source of problems right now.
I say that I'm not aware of issues with the debugger. I know what the
potential problem could be, but I don't know whether it exists or not
right now because I have not seen any issues with the debugger lately.
If you have a reproducible case, please share it here.

2) You say that the source files can safely be treated as
only-to-be-used-by-the-IDE. I say that's not the case. IMO, I should
be able to fork a process scanning the sources and do something else while
it's processing the code. I also think that external tools like Seaside
should be able to read the source files without messing up the image.

Did I understand you correctly based on these two points above?


Levente
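
The cost argument behind `cacheDuring:` and the forked-scanner case in point 2 can be illustrated with a small model. This is a sketch under assumptions, not the Squeak implementation: it is in Python so it can be run outside an image, `read_only_copy`, `cache_during`, `source_at` and `scan_all` are hypothetical names, and the scope is tracked with a thread-local variable, whereas the thread describes `CurrentReadOnlySourceFiles` as using exceptions to define the scope.

```python
import io
import threading
from contextlib import contextmanager

SOURCE_TEXT = b"method one!method two!method three!"
copies_opened = 0                  # how many read-only copies were created
_counter_lock = threading.Lock()
_scope = threading.local()         # per-thread state, standing in for a per-process scope


def read_only_copy():
    """Stand-in for #readOnlyCopy: every call creates a fresh, independent stream."""
    global copies_opened
    with _counter_lock:
        copies_opened += 1
    return io.BytesIO(SOURCE_TEXT)


@contextmanager
def cache_during():
    """Rough analogue of `CurrentReadOnlySourceFiles cacheDuring: [ ... ]`."""
    if getattr(_scope, "cached", None) is not None:
        yield                      # nested use: keep the outer scope's copy
        return
    _scope.cached = read_only_copy()
    try:
        yield
    finally:
        _scope.cached.close()
        _scope.cached = None


def source_at(offset, length):
    """Read source text, reusing the scoped copy when one is active."""
    cached = getattr(_scope, "cached", None)
    stream = cached if cached is not None else read_only_copy()
    try:
        stream.seek(offset)
        return stream.read(length)
    finally:
        if stream is not cached:
            stream.close()         # uncached access: one copy per read


def scan_all():
    """A scan that touches the source three times."""
    return [source_at(0, 10), source_at(11, 10), source_at(22, 12)]


uncached = scan_all()              # creates one copy per access
opened_without_cache = copies_opened

results = {}


def worker(name):
    with cache_during():           # each thread reads through its own single copy
        results[name] = scan_all()


threads = [threading.Thread(target=worker, args=(n,)) for n in ("a", "b")]
for t in threads:
    t.start()
for t in threads:
    t.join()
opened_with_cache = copies_opened - opened_without_cache
```

Each scanning thread here creates one copy for its whole scan instead of one per read, and no thread shares a file position with another, which is the property a forked scanner or a web-server process would rely on.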



Re: CurrentReadOnlySourceFiles (was: Re: Question about inlining | How to access named temps in FullBlockClosure?)

Eliot Miranda-2
Hi Levente,

On Tue, Mar 31, 2020 at 9:25 PM Levente Uzonyi <[hidden email]> wrote:
Hi Eliot,

On Tue, 31 Mar 2020, Eliot Miranda wrote:

>       > My assertion is that the IDE is essentially single-threaded and this doesn't have to be supported.  In any case, concurrent access will work if processes of the same priority level are cooperating.  But I just answered the
>       > debugger issue.  I'm assuming that the debugger guards all its source access by substituting a different file.  So it, and only it, accesses the sources files through copies, and it, and only it pays the cost for substituting
>       > the copies.  Normal queries can use a single read-only copy.  That gives us the functionality of cacheDuring: without having to invoke it.
>
>       The IDE is single-threaded but source files may be read outside the
>       context of the IDE.
>
>
> Can you give me a for instance?  I simply don't believe you.  And even if it's true I don't see that it has to be supported.  Please don't be vague.  This is important.

For example, Seaside has a web-based code browser. The webserver, no
matter which one is used by Seaside, will read the code from a process
different than the UI process.

Yes, but there's still no implication that source access should be thread-safe.  The Seaside access to the IDE is still happening in the context of a cooperatively threaded Smalltalk, and there are places in Seaside where access to the IDE could be serialized without relying on support for thread-safe source access when there is no thread-safe access to adding/removing methods.  So for the Seaside browser to function properly, synchronization needs to be added to the general interface between Seaside and the IDE, not just source access.  For example, if we had a Seaside Squeak IDE server that allowed sharing between multiple programmers, I suggest that the right way to serialize access is to provide some kind of synchronized queue between Seaside and the IDE, not to try and make the IDE thread-safe.  Updating things like class definitions, which potentially implies recompiling all methods in a class hierarchy, requires that no other modifications to the class hierarchy are occurring while a class and its subclasses are being redefined.
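
One way to picture the "synchronized queue between Seaside and the IDE" is a single consumer that performs every IDE operation on behalf of the server processes. The sketch below is only a model of that idea under stated assumptions: it is in Python so it can be run outside an image, the dictionary stands in for a method dictionary, and `ide_loop`, `submit`, `define` and `handler` are hypothetical names, not Seaside or Squeak API.

```python
import queue
import threading
from concurrent.futures import Future

requests = queue.Queue()                 # the synchronized queue
method_dictionary = {"foo": "foo ^1"}    # touched only by the IDE thread


def ide_loop():
    """Single consumer: every read and modification happens on this thread."""
    while True:
        item = requests.get()
        if item is None:                 # sentinel: shut down
            break
        action, future = item
        try:
            future.set_result(action())
        except Exception as exc:
            future.set_exception(exc)


def submit(action):
    """Called from any server thread; blocks until the IDE thread has run the action."""
    future = Future()
    requests.put((action, future))
    return future.result()


def define(selector, source):
    """An IDE-side modification; runs only inside ide_loop."""
    method_dictionary[selector] = source
    return selector


def handler(i):
    """A web request handler that asks the IDE thread to add a method."""
    submit(lambda: define(f"m{i}", f"m{i} ^{i}"))


ide = threading.Thread(target=ide_loop)
ide.start()

handlers = [threading.Thread(target=handler, args=(i,)) for i in range(8)]
for h in handlers:
    h.start()
for h in handlers:
    h.join()

source_of_foo = submit(lambda: method_dictionary["foo"])
requests.put(None)
ide.join()
```

Because the requests are executed one at a time, neither the dictionary nor any source stream the actions might read needs its own locking; the cost is that a slow request, such as a class redefinition, delays everything queued behind it.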


>       > So let me reiterate.
>       >
>       > SourceFiles is modified to have a single writable version of the changes file and a single read-only version of the sources and changes files.  Source code is read through the readable copy and new source written through the writable copy.  Whenever the debugger accesses source it does so through a method that first saves the files, substitutes copies in SourceFiles, evaluates its block (that will access source through the copies), and then ensures that the original files are restored.  There can be error checking for writing to the changes file in the debugger while writes are in progress to the original writable changes file, although I'm not sure this is necessary; folks debugging source file access usually know what they're doing.
>       >
>       > The result is that
>       > - normal source reading does not require creating a read-only copy; it already exists.
>
>       Do you mean that the existence of #readOnlyCopy is satisfactory?
>
>
> Yes.
>  
>       Creating a copy for every single file access is painfully slow.
>
>
> Exactly.
>  
>       CurrentReadOnlySourceFiles only exists to remedy that by reusing the same
>       read-only copy.
>
>
> I feel like you're not understanding my proposal.  Apologies if I'm presuming.  With my proposal the only time a new copy is created is when the debugger wants to access the source of a method.  That happens on the order of
> seconds, not microseconds as happens when scanning for source.

You say that each tool should use the same shared file streams. If so,
that implies that all sends of #readOnlyCopy have unnecessarily been added
over the years except for those which are related to the debugger.

That's my opinion, yes.
 
And if I were to remove them along with CurrentReadOnlySourceFiles,
everything would stay normal, right?

I think so.  Provided the debugger accesses source carefully, everything should be OK.  It's certainly worth an experiment, right?

 
>       > - the debugger does not interfere with source access because it is careful to use copies and leave the originals undisturbed
>
>       That's exactly what I tried to imply by stating that there was no problem
>       with the debugger before the mass use of read-only copies was introduced.
>
>
> Can you not see that in any scheme there is the potential for chaos if the debugger is accessing the source as one steps through code of methods that themselves are in the process of accessing source?  And so it is key that the debugger *not* perturb the system when it itself accesses source?
>
> I'm confused.  We seem to be talking past each other.  I feel like you're blocking a reasonable proposal but I don't really understand what your objections are.  I apologize.  I'm not trying to be confrontational, but I do
> think my proposal is important and has merit and I feel frustrated by you because I can't quite understand why you're against it.  If you can identify a serious flaw I'll happily abandon it.  But I need to understand the flaw
> first.

I think that we have different ideas about what the problem is:

1) You say that the debugger is a key source of problems right now.
I say that I'm not aware of issues with the debugger. I know what the
potential problem could be, but I don't know whether it exists or not
right now because I have not seen any issues with the debugger lately.
If you have a reproducible case, please share it here.

No, I'm not saying that the debugger is a source of problems.  I'm saying that if one wants to debug source access then, w.r.t. source access, the debugger must not interfere.  Since the debugger necessarily accesses source as it is used, it must not perturb source access while it is being used to debug source access.

Let me be clear.  Let's say we want to step through

      (Object >> #at:) timeStamp

As we stepped through, the debugger would access several methods in CompiledMethod and FileStream.  If we only had one read-only copy of the source file, then the potential exists for the source pointer of the read-only copy to be changed after it has been set to point at the chunk for Object>>#at:, and hence to get the wrong answer, maybe answering the timeStamp for CompiledMethod>>preamble or some such.  So every time the debugger accesses source it must be careful not to disturb the single read-only copy of the source file.  It can easily do this by using a private copy of the source file.
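
To illustrate the hazard with something that can be evaluated in a workspace, here is a minimal sketch that uses an in-memory ReadStream as a stand-in for the shared read-only source file.  The chunk strings are made up, and real source files are file streams, so this only models the shared file pointer, not actual source access.

```smalltalk
"In-memory stand-in for a source file: two chunks, each terminated by $!."
| shared private wrong right |
shared := ReadStream on: 'timeStamp for Object>>at:!timeStamp for CompiledMethod>>preamble!'.

"The code being debugged positions the shared stream at the first chunk."
shared position: 0.

"The debugger now reads through the same stream, which moves the pointer."
shared upTo: $!.

"The debugged code resumes and reads from wherever the pointer now is."
wrong := shared upTo: $!.   "the CompiledMethod>>preamble chunk, not the intended one"

"With a private copy, the debugger's read leaves the shared position alone."
shared position: 0.
private := shared copy.
private upTo: $!.
right := shared upTo: $!.   "the Object>>at: chunk, as intended"
```

The second half has the shape of what the debugger would do under the proposal: it reads through its own copy, so the position of the shared stream is never touched.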

2) You say that the source files can safely be treated as
only-to-be-used-by-the-IDE. I say that's not the case. IMO, I should
be able to fork a process scanning the sources and do something else while
it's processing the code. I also think that external tools like Seaside
should be able to read the source files without messing up the image.

Have you ever done this in practice?
 

Did I understand you correctly based on these two points above?

First, no.  Second, yes.  Is my clarification adequate?
 
Levente

>
>
>
>       Levente
>
>       > - CurrentReadOnlySourceFiles and cacheDuring: can be discarded

--
_,,,^..^,,,_
best, Eliot



Re: CurrenReadOnlySourceFiles (was: Re: Question about inlining | How to access named temps in FullBlockClosure?)

Levente Uzonyi
Hi Eliot,

On Wed, 1 Apr 2020, Eliot Miranda wrote:

> Hi Levente,
>
> On Tue, Mar 31, 2020 at 9:25 PM Levente Uzonyi <[hidden email]> wrote:
>       Hi Eliot,
>
>       On Tue, 31 Mar 2020, Eliot Miranda wrote:
>
>       > Hi Levente,
>       >
>       > On Tue, Mar 31, 2020 at 7:38 PM Levente Uzonyi <[hidden email]> wrote:
>       >       Hi Eliot,
>       >
>       >       On Tue, 31 Mar 2020, Eliot Miranda wrote:
>       >
>       >       > Hi Levente,
>       >       >
>       >       > On Tue, Mar 31, 2020 at 3:33 PM Levente Uzonyi <[hidden email]> wrote:
>       >       >       Hi Eliot,
>       >       >
>       >       >       On Mon, 30 Mar 2020, Eliot Miranda wrote:
>       >       >
>       >       >       > Hi Levente,
>       >       >       >
>       >       >       >> On Mar 30, 2020, at 2:21 PM, Levente Uzonyi <[hidden email]> wrote:
>       >       >       >>
>       >       >       >> Hi Eliot,
>       >       >       >>
>       >       >       >>> On Mon, 30 Mar 2020, Eliot Miranda wrote:
>       >       >       >>>
>       >       >       >>> Well, that's not what I meant by a search.  However, as Levente pointed out, textual searches should be surrounded with CurrentReadOnlySourceFiles cacheDuring:.  I think this is an awful implementation and would implement it very differently but that's the work-around we have in place now,
>       >       >       >>
>       >       >       >> How would you implement it?
>       >       >       >>
>       >       >       >> <history>
>       >       >       >> When I introduced CurrentReadOnlySourceFiles, I wanted to solve the issue of concurrent access to the source files.
>       >       >       >> I had the following options:
>       >       >       >> 1) controlled access to a shared resource (a single read-only copy of the SourceFiles array) with e.g. a Monitor
>       >       >       >> 2) same as 1) but with multiple copies pooled
>       >       >       >> 3) exceptions to define the scope and lifetime of the resources (the read-only copies) within a process
>       >       >       >>
>       >       >       >> I chose the third option because it was possible to introduce it without exploring and rewriting existing users: you could leave all code as-is and sprinkle CurrentReadOnlySourceFiles cacheDuring: [ ... ] around code that needed better performance.
>       >       >       >> It's obviously not a perfect solution, but I still think it was the best available at the time.
>       >       >       >>
>       >       >       >> Later ProcessLocalVariables were added to Trunk. Which could be used to solve the concurrent access issue by using process-local copies of the source files. The only challenge is to release them after
>       >       they are
>       >       >       not needed any more. Perhaps a high priority process could do that after a few minutes of inactivity. Or we could just let them linger and see if they cause any problems.
>       >       >       >> </history>
>       >       >       >
>       >       >       > I think the key issue (& this from a discussion here with Bert) is access to source in the debugger while one is debugging file access.  As the debugger asks for source, the file pointer is moved, and hence screws up the access one is trying to debug.
>       >       >
>       >       >       I don't think that's the only issue. Have a look at the senders of
>       >       >       #readOnlyCopy. Many of them were added 10+ years ago, well before
>       >       >       CurrentReadOnlySourceFiles was introduced. Most of those could use
>       >       >       CurrentReadOnlySourceFiles too but are unrelated to the debugger.
>       >       >
>       >       >
>       >       > Yes, but IIRC that issue was to separate the writable file from the read-only file.  I remember dealing with this when working on Newspeak in 2007/2008. So SourceFiles can easily maintain a writable file and a
>       >       read-only copy
>       >       > of the file for both sources and changes and do writes through the writable one.
>       >       >
>       >       >
>       >       >       >
>       >       >       > So I would provide something like
>       >       >       >   SourceFiles withSubstituteCopiesWhile: aBlock
>       >       >       > which would install either copies of the files or read-only copies of the files for the duration of the block, and have the debugger use the method around its access to method source.
>       >       >       >
>       >       >       > The IDE is essentially single threaded as far as code modification goes, even if this isn’t enforced. There is no serialization on adding/removing methods and concurrent access can corrupt method
>       >       dictionaries,
>       >       >       and that limitation is fine in practice.  So having the same restriction on source file access is fine too (and in fact I think the restriction already applies; if one were to fork compiles then source
>       >       update to
>       >       >       the changes file could get corrupted too).
>       >       >       >
>       >       >       > So I think not using read-only copies to read source, and having the debugger use copies for its access would be a good lightweight solution.
>       >       >
>       >       >       I agree with what you wrote about method changes, but reading the sources
>       >       >       concurrently is still a possibility, especially when multiple UI processes
>       >       >       can exist at the same time (e.g. that's what opening a debugger does,
>       >       >       right?).
>       >       >
>       >       >
>       >       > My assertion is that the IDE is essentially single-threaded and this doesn't have to be supported.  In any case, concurrent access will work if processes of the same priority level are cooperating.  But I just answered the debugger issue.  I'm assuming that the debugger guards all its source access by substituting a different file.  So it, and only it, accesses the source files through copies, and it, and only it, pays the cost for substituting the copies.  Normal queries can use a single read-only copy.  That gives us the functionality of cacheDuring: without having to invoke it.
>       >
>       >       The IDE is single-threaded but source files may be read outside the
>       >       context of the IDE.
>       >
>       >
>       > Can you give me a for instance?  I simply don't believe you.  And even if it's true I don't see that it has to be supported.  Please don't be vague.  This is important.
>
>       For example, Seaside has a web-based code browser. The webserver, no
>       matter which one is used by Seaside, will read the code from a process
>       different than the UI process.
>
>
> Yes, but there's still no implication that source access should be thread-safe.  The Seaside access to the IDE is still happening in the context of a cooperatively threaded Smalltalk, and there are places in Seaside where access to the IDE could be serialized without relying on support for thread-safe source access when there is no thread-safe access to adding/removing methods.  So for the Seaside browser to function properly, synchronization needs to be added to the general interface between Seaside and the IDE, not just to source access.  For example, if we had a Seaside Squeak IDE server that allowed sharing between multiple programmers, I suggest that the right way to serialize access is to provide some kind of synchronized queue between Seaside and the IDE, not to try to make the IDE thread-safe.  Updating things like class definitions, which potentially implies recompiling all methods in a class hierarchy, requires that no other modifications to the class hierarchy are occurring while a class and its subclasses are being redefined.
Even though it's extremely unlikely, browsing code (not modifying, just
viewing) with the Seaside Code Browser has a potential to mess up the
source files in the image if source access is not thread-safe.

>
>
>       >       > So let me reiterate.
>       >       >
>       >       > SourceFiles is modified to have a single writable version of the changes file and a single read-only version of the sources and changes files.  Source code is read through the readable copy and new source written through the writable copy.  Whenever the debugger accesses source it does so through a method that first saves the files, substitutes copies in SourceFiles, evaluates its block (that will access source through the copies), and then ensures that the original files are restored.  There can be error checking for writing to the changes file in the debugger while writes are in progress to the original writable changes file, although I'm not sure this is necessary; folks debugging source file access usually know what they're doing.
>       >       >
>       >       > The result is that
>       >       > - normal source reading does not require creating a read-only copy; it already exists.
>       >
>       >       Do you mean that the existence of #readOnlyCopy is satisfactory?
>       >
>       >
>       > Yes.
>       >  
>       >       Creating a copy for every single file access is painfully slow.
>       >
>       >
>       > Exactly.
>       >  
>       >       CurrentReadOnlySourceFiles only exists to remedy that by reusing the same
>       >       read-only copy.
>       >
>       >
>       > I feel like you're not understanding my proposal.  Apologies if I'm presuming.  With my proposal the only time a new copy is created is when the debugger wants to access the source of a method.  That happens on the order of
>       > seconds, not microseconds as happens when scanning for source.
>
>       You say that each tool should use the same shared file streams. If so,
>       that implies that all sends of #readOnlyCopy have unnecessarily been added
>       over the years except for those which are related to the debugger.
>
>
> That's my opinion, yes.
>  
>       And if I were to remove them along with CurrentReadOnlySourceFiles,
>       everything would stay normal, right?
>
>
> I think so.  Provided the debugger accesses source carefully, everything should be OK.  It's certainly worth an experiment, right?
I gave it a try and replaced all references to CurrentReadOnlySourceFiles
with a global copy of SourceFiles, ReadOnlySourceFiles, in my image.
Nothing has broken so far, but the system is extremely fragile:
- saving the image with a different name will leave the read-only copies
intact
- starting up the image will use old file references
- even if the debugger worked properly, opening an image with debuggers
left open on code related to source file handling will cause problems

Of course, those problems can be tackled, but the conclusion is that
it's probably not a good idea to use a separate global for the read-only
source files. The best seems to be to hold them in the SourceFiles object
(a SourceFileArray).

I did a quick search in the mail archives, and found this post of yours on
the Pharo list:
http://forum.world.st/From-a-mooc-user-method-source-with-it-take-so-long-in-Pharo-5-tp4895802p4896670.html

You suggested introducing a new method:

  SourceFiles substituteFreshReadOnlyCopiesDuring: [...file access...]

If the file access is restricted to the block, and the block doesn't
access global state related to the source files, then it's possible to
make the read-only file access thread-safe with no additional cost:
- hold the read-only copies in a new instance variable of SourceFileArray
(e.g. readOnlyFiles)
- only pass the read-only copies to the block, don't expose them globally
- before the block is evaluated, save readOnlyFiles in a temporary, nil
out the readOnlyFiles variable, pass the temporary to the block
- after the block has been evaluated, check whether readOnlyFiles is nil.
If it's not nil, close and throw away the files passed to the block. If
it's nil, check whether they still point to the same files as the actual
source files. If yes, save them in the readOnlyFiles variable, else close
and throw away.
That way there's a globally shared read-only copy of the source files that
is guaranteed to be used by only one process at a time. If two or more
processes need the read-only copies at the same time, new read-only copies
will be created, ensuring thread-safety.
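
The steps above can be sketched in a workspace as follows. This is only an illustration under stated assumptions: cached plays the role of the proposed readOnlyFiles instance variable, makeCopies and isCurrent are hypothetical stand-ins for creating the read-only copies and for checking that they still point to the same files as the actual source files, and in-memory ReadStreams replace the real files. It also assumes that swapping cached into a temporary is not interrupted by another process.

```smalltalk
| cached created makeCopies isCurrent withCopiesDo |
cached := nil.       "stands in for the proposed readOnlyFiles variable"
created := 0.        "counts how many sets of copies get made"

"Hypothetical: build a fresh set of read-only copies."
makeCopies := [
	created := created + 1.
	Array
		with: (ReadStream on: 'sources')
		with: (ReadStream on: 'changes')].

"Hypothetical: do the copies still match the actual source files? Always true here."
isCurrent := [:files | true].

withCopiesDo := [:aBlock | | files |
	"Take the cached copies, if any, and nil out the shared slot."
	files := cached.
	cached := nil.
	files ifNil: [files := makeCopies value].
	[aBlock value: files] ensure: [
		"Put the copies back only if the slot is still empty and they are current."
		(cached isNil and: [isCurrent value: files])
			ifTrue: [cached := files]
			ifFalse: [files do: [:each | each close]]]].

"Two overlapping clients: the inner one finds the slot empty and gets its own copies."
withCopiesDo value: [:outer |
	withCopiesDo value: [:inner | inner == outer]].
```

After this runs, created is 2. The inner set was stored back in cached when the inner block finished, so the outer set was closed and discarded on exit, as the fourth step prescribes.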

>
>  
>       >       > - the debugger does not interfere with source access because it is careful to use copies and leave the originals undisturbed
>       >
>       >       That's exactly what I tried to imply by stating that there was no problem
>       >       with the debugger before the mass use of read-only copies was introduced.
>       >
>       >
>       > Can you not see that in any scheme there is the potential for chaos if the debugger is accessing the source as one steps through code of methods that themselves are in the process of accessing source?  And so it is key that the debugger *not* perturb the system when it itself accesses source?
>       >
>       > I'm confused.  We seem to be talking past each other.  I feel like you're blocking a reasonable proposal but I don't really understand what your objections are.  I apologize.  I'm not trying to be confrontational, but I do
>       > think my proposal is important and has merit and I feel frustrated by you because I can't quite understand why you're against it.  If you can identify a serious flaw I'll happily abandon it.  But I need to understand the flaw
>       > first.
>
>       I think that we have different ideas about what the problem is:
>
>       1) You say that the debugger is a key source of problems right now.
>       I say that I'm not aware of issues with the debugger. I know what the
>       potential problem could be, but I don't know whether it exists or not
>       right now because I have not seen any issues with the debugger lately.
>       If you have a reproducible case, please share it here.
>
>
> No, I'm not saying that the debugger is a source of problems.  I'm saying that if one wants to debug source access then, w.r.t. source access, the debugger must not interfere.  Since the debugger necessarily accesses source as it is used, it must not perturb source access while it is being used to debug source access.
>
> Let me be clear.  Let's say we want to step through
>
>       (Object >> #at:) timeStamp
>
> As we stepped through, the debugger would access several methods in CompiledMethod and FileStream.  If we only had one read-only copy of the source file, then the potential exists for the source pointer of the read-only copy to be changed after it has been set to point at the chunk for Object>>#at:, and hence to get the wrong answer, maybe answering the timeStamp for CompiledMethod>>preamble or some such.  So every time the debugger accesses source it must be careful not to disturb the single read-only copy of the source file.  It can easily do this by using a private copy of the source file.
>
>       2) You say that the source files can safely be treated as
>       only-to-be-used-by-the-IDE. I say that's not the case. IMO, I should
>       be able to fork a process scanning the sources and do something else while
>       it's processing the code. I also think that external tools like Seaside
>       should be able to read the source files without messing up the image.
>
>
> Have you ever done this in practice?
Probably yes, but I can't remember the exact situation.
I remembered that SmallLint or one of its descendants does the linting in
the background. I checked swalint, and found that it uses the UI process.
So it seems that only the Pharo version uses background processes.


Levente

>  
>
>       Did I understand you correctly based on these two points above?
>
>
> First, no.  Second, yes.  Is my clarification adequate?
>  
>       Levente
>
>       >
>       >
>       >
>       >       Levente
>       >
>       >       > - CurrentReadOnlySourceFiles and cacheDuring: can be discarded