readOnlyFileNamed:do: vs. ...

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
18 messages Options
Reply | Threaded
Open this post in threaded view
|

readOnlyFileNamed:do: vs. ...

Stéphane Ducasse
Nicolas I saw that in squeak you changed and use readOnlyFileNamed:do:
I like it but I would like to understand what is the key advantage


fromFileNamed: aName
      FileStream readOnlyFileNamed: aName do: [:stream |
               stream setConverterForCode.
               self fileInFrom: stream]

fromFileNamed: aName
       | stream |
       stream := FileStream readOnlyFileNamed: aName.
       stream setConverterForCode.
       [self fileInFrom: stream] ensure: [stream close].

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: readOnlyFileNamed:do: vs. ...

Henrik Sperre Johansen
My guess is he uses a method which does not register the file in
FileRegistry, thus avoids a WeakDict cleanup when he closes the file
after the block has been evaluated.
(eg readOnlyFileNamed:do: vs. readOnlyFileNamed:)
Thus, a lot of time is saved from not doing weak registry cleanup.
(Which can be a large chunk of the total time when you do many small
operations).

Cheers,
Henry

Den 02.02.2010 10:50, skrev Stéphane Ducasse:

> Nicolas I saw that in squeak you changed and use readOnlyFileNamed:do:
> I like it but I would like to understand what is the key advantage
>
>
> fromFileNamed: aName
>       FileStream readOnlyFileNamed: aName do: [:stream |
>                stream setConverterForCode.
>                self fileInFrom: stream]
>
> fromFileNamed: aName
>        | stream |
>        stream := FileStream readOnlyFileNamed: aName.
>        stream setConverterForCode.
>        [self fileInFrom: stream] ensure: [stream close].
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>
>
>  

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: readOnlyFileNamed:do: vs. ...

Henrik Sperre Johansen
Scratch that. It's what I would have done with a fileNamed:do: selector :)

The main difference is readOnlyFileNamed:do: does not evaluate the block
if file creation returns nil, while the second code errors out with
stream dnu setConverterFromCode if the file could not be opened for any
reason.

Cheers,
Henry

Den 02.02.2010 11:02, skrev Henrik Johansen:

> My guess is he uses a method which does not register the file in
> FileRegistry, thus avoids a WeakDict cleanup when he closes the file
> after the block has been evaluated.
> (eg readOnlyFileNamed:do: vs. readOnlyFileNamed:)
> Thus, a lot of time is saved from not doing weak registry cleanup.
> (Which can be a large chunk of the total time when you do many small
> operations).
>
> Cheers,
> Henry
>
> Den 02.02.2010 10:50, skrev Stéphane Ducasse:
>  
>> Nicolas I saw that in squeak you changed and use readOnlyFileNamed:do:
>> I like it but I would like to understand what is the key advantage
>>
>>
>> fromFileNamed: aName
>>       FileStream readOnlyFileNamed: aName do: [:stream |
>>                stream setConverterForCode.
>>                self fileInFrom: stream]
>>
>> fromFileNamed: aName
>>        | stream |
>>        stream := FileStream readOnlyFileNamed: aName.
>>        stream setConverterForCode.
>>        [self fileInFrom: stream] ensure: [stream close].
>>
>> _______________________________________________
>> Pharo-project mailing list
>> [hidden email]
>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>
>>
>>  
>>    
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>
>
>  

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: readOnlyFileNamed:do: vs. ...

Schwab,Wilhelm K
Henry,

I would not toss your first thought too quickly.  Staying clear of finalization is a big benefit.  I have never been able to trust it (for files) on Dolphin, which has a remarkably better implementation of weak collections and finalization than Squeak/Pharo.

Bill



-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On Behalf Of Henrik Johansen
Sent: Tuesday, February 02, 2010 5:08 AM
To: [hidden email]
Subject: Re: [Pharo-project] readOnlyFileNamed:do: vs. ...

Scratch that. It's what I would have done with a fileNamed:do: selector :)

The main difference is readOnlyFileNamed:do: does not evaluate the block if file creation returns nil, while the second code errors out with stream dnu setConverterFromCode if the file could not be opened for any reason.

Cheers,
Henry

Den 02.02.2010 11:02, skrev Henrik Johansen:

> My guess is he uses a method which does not register the file in
> FileRegistry, thus avoids a WeakDict cleanup when he closes the file
> after the block has been evaluated.
> (eg readOnlyFileNamed:do: vs. readOnlyFileNamed:) Thus, a lot of time
> is saved from not doing weak registry cleanup.
> (Which can be a large chunk of the total time when you do many small
> operations).
>
> Cheers,
> Henry
>
> Den 02.02.2010 10:50, skrev Stéphane Ducasse:
>  
>> Nicolas I saw that in squeak you changed and use readOnlyFileNamed:do:
>> I like it but I would like to understand what is the key advantage
>>
>>
>> fromFileNamed: aName
>>       FileStream readOnlyFileNamed: aName do: [:stream |
>>                stream setConverterForCode.
>>                self fileInFrom: stream]
>>
>> fromFileNamed: aName
>>        | stream |
>>        stream := FileStream readOnlyFileNamed: aName.
>>        stream setConverterForCode.
>>        [self fileInFrom: stream] ensure: [stream close].
>>
>> _______________________________________________
>> Pharo-project mailing list
>> [hidden email]
>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>
>>
>>  
>>    
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>
>
>  

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: readOnlyFileNamed:do: vs. ...

Henrik Sperre Johansen


Den 02.02.2010 13:00, skrev Schwab,Wilhelm K:
> Henry,
>
> I would not toss your first thought too quickly.  Staying clear of finalization is a big benefit.  I have never been able to trust it (for files) on Dolphin, which has a remarkably better implementation of weak collections and finalization than Squeak/Pharo.
>
> Bill
>  

Stephane's question was not what it should do differently though, but
what it did do differently ;)

For information:
(1 to: 3) collect: [:h | [CompiledMethod allInstancesDo: [:each | each
getSourceFromFile]] timeToRun]

Standard way, which both registers in the FileRegistry, and does close
at end of getSourceFromFile:
#(13348 12786 12712)

Explicit closing with readOnlyFileNamed:do:
 #(12091 12061 12189)

Considering the amount of work done in reading each source, that's not
entirely insignificant,
but it requires quite a large amount of additional private protocol in
the FileStream classes...

Cheers,
Henry

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: readOnlyFileNamed:do: vs. ...

Henrik Sperre Johansen


Den 02.02.2010 13:14, skrev Henrik Johansen:

> For information:
> (1 to: 3) collect: [:h | [CompiledMethod allInstancesDo: [:each | each
> getSourceFromFile]] timeToRun]
>
> Standard way, which both registers in the FileRegistry, and does close
> at end of getSourceFromFile:
> #(13348 12786 12712)
>
> Explicit closing with readOnlyFileNamed:do:
>  #(12091 12061 12189)
>  
Blargh, numbers were without nextChunk update.

With latest version, the differences are:

Standard way, which both registers in the FileRegistry, and does close
at end of getSourceFromFile:
#(7697 7684 7694)

Explicit closing with readOnlyFileNamed:do:
 #(6586 6597 6594)

Cheers,
Henry



_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: readOnlyFileNamed:do: vs. ...

Stéphane Ducasse
In reply to this post by Henrik Sperre Johansen
Hi henrik

this is for fileStream
and I like the idea that this is more compact now reading the code of

fromFileNamed: aName
     FileStream readOnlyFileNamed: aName do: [:stream |
              stream setConverterForCode.
              self fileInFrom: stream]

readOnlyFileNamed: fileName do: aBlock
       
        ^ self detectFile: [ self readOnlyFileNamed: fileName ] do: aBlock

detectFile: aBlock do: anotherBlock
       
        | file |
        file := aBlock value.
        ^ file
                ifNil: [ nil ]
         ifNotNil: [ [anotherBlock value: file] ensure: [file close]]

It is less obvious to me that the ensure: will really close the file
if readOnlyFileNamed: fails.

Before it was clearly not the case
fromFileNamed: aName
      | stream |
      stream := FileStream readOnlyFileNamed: aName.
      stream setConverterForCode.
      [self fileInFrom: stream] ensure: [stream close].

so I would have written it
fromFileNamed: aName
      | stream |
      [stream := FileStream readOnlyFileNamed: aName.
      stream setConverterForCode.
      self fileInFrom: stream] ensure: [stream close].

what do you think?

Stef



On Feb 2, 2010, at 11:07 AM, Henrik Johansen wrote:

> Scratch that. It's what I would have done with a fileNamed:do: selector :)
>
> The main difference is readOnlyFileNamed:do: does not evaluate the block
> if file creation returns nil, while the second code errors out with
> stream dnu setConverterFromCode if the file could not be opened for any
> reason.
>
> Cheers,
> Henry
>
> Den 02.02.2010 11:02, skrev Henrik Johansen:
>> My guess is he uses a method which does not register the file in
>> FileRegistry, thus avoids a WeakDict cleanup when he closes the file
>> after the block has been evaluated.
>> (eg readOnlyFileNamed:do: vs. readOnlyFileNamed:)
>> Thus, a lot of time is saved from not doing weak registry cleanup.
>> (Which can be a large chunk of the total time when you do many small
>> operations).
>>
>> Cheers,
>> Henry
>>
>> Den 02.02.2010 10:50, skrev Stéphane Ducasse:
>>
>>> Nicolas I saw that in squeak you changed and use readOnlyFileNamed:do:
>>> I like it but I would like to understand what is the key advantage
>>>
>>>
>>>
>>>
>>>
>>>
>>> _______________________________________________
>>> Pharo-project mailing list
>>> [hidden email]
>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>>
>>>
>>>
>>>
>> _______________________________________________
>> Pharo-project mailing list
>> [hidden email]
>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>
>>
>>
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project


_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: readOnlyFileNamed:do: vs. ...

Stéphane Ducasse
In reply to this post by Henrik Sperre Johansen

On Feb 2, 2010, at 1:32 PM, Henrik Johansen wrote:

>
>
> Den 02.02.2010 13:14, skrev Henrik Johansen:
>> For information:
>> (1 to: 3) collect: [:h | [CompiledMethod allInstancesDo: [:each | each
>> getSourceFromFile]] timeToRun]
>>
>> Standard way, which both registers in the FileRegistry, and does close
>> at end of getSourceFromFile:
>> #(13348 12786 12712)
>>
>> Explicit closing with readOnlyFileNamed:do:
>> #(12091 12061 12189)
>>
> Blargh, numbers were without nextChunk update.
>
> With latest version, the differences are:
>
> Standard way, which both registers in the FileRegistry, and does close
> at end of getSourceFromFile:
> #(7697 7684 7694)
>
> Explicit closing with readOnlyFileNamed:do:
> #(6586 6597 6594)

interesting.

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: readOnlyFileNamed:do: vs. ...

Stéphane Ducasse
In reply to this post by Henrik Sperre Johansen
>
>> Henry,
>>
>> I would not toss your first thought too quickly.  Staying clear of finalization is a big benefit.  I have never been able to trust it (for files) on Dolphin, which has a remarkably better implementation of weak collections and finalization than Squeak/Pharo.
>>
>> Bill
>>
>
> Stephane's question was not what it should do differently though, but
> what it did do differently ;)
>
> For information:
> (1 to: 3) collect: [:h | [CompiledMethod allInstancesDo: [:each | each
> getSourceFromFile]] timeToRun]
>
> Standard way, which both registers in the FileRegistry, and does close
> at end of getSourceFromFile:
> #(13348 12786 12712)
>
> Explicit closing with readOnlyFileNamed:do:
> #(12091 12061 12189)
>
> Considering the amount of work done in reading each source, that's not
> entirely insignificant,
> but it requires quite a large amount of additional private protocol in
> the FileStream classes...

you lost me there can you explain why do you want to add private protocol to FileStream
_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: readOnlyFileNamed:do: vs. ...

Stéphane Ducasse
In reply to this post by Henrik Sperre Johansen
>> For information:
>> (1 to: 3) collect: [:h | [CompiledMethod allInstancesDo: [:each | each
>> getSourceFromFile]] timeToRun]
>>
>> Standard way, which both registers in the FileRegistry, and does close
>> at end of getSourceFromFile:


do you have the code for this one?

>>
> Standard way, which both registers in the FileRegistry, and does close
> at end of getSourceFromFile:
> #(7697 7684 7694)
>
> Explicit closing with readOnlyFileNamed:do:
> #(6586 6597 6594)
>
> Cheers,
> Henry
>
>
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project


_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: readOnlyFileNamed:do: vs. ...

Henrik Sperre Johansen
In reply to this post by Stéphane Ducasse
On 02.02.2010 18:35, Stéphane Ducasse wrote:

> Hi henrik
>
> this is for fileStream
> and I like the idea that this is more compact now reading the code of
>
> fromFileNamed: aName
>       FileStream readOnlyFileNamed: aName do: [:stream |
>                stream setConverterForCode.
>                self fileInFrom: stream]
>
> readOnlyFileNamed: fileName do: aBlock
>
> ^ self detectFile: [ self readOnlyFileNamed: fileName ] do: aBlock
>
> detectFile: aBlock do: anotherBlock
>
> | file |
> file := aBlock value.
> ^ file
> ifNil: [ nil ]
>           ifNotNil: [ [anotherBlock value: file] ensure: [file close]]
>
> It is less obvious to me that the ensure: will really close the file
> if readOnlyFileNamed: fails.
>
> Before it was clearly not the case
> fromFileNamed: aName
>        | stream |
>        stream := FileStream readOnlyFileNamed: aName.
>        stream setConverterForCode.
>        [self fileInFrom: stream] ensure: [stream close].
>
> so I would have written it
> fromFileNamed: aName
>        | stream |
>        [stream := FileStream readOnlyFileNamed: aName.
>        stream setConverterForCode.
>        self fileInFrom: stream] ensure: [stream close].
>
> what do you think?
>
> Stef
>    
Not really.
Either there's no error when you request the file be opened, and the
block works, or you don't, and there's nothing to close.
So as long as the block meant to open a file really does nothing else
but try to open a file, there's nothing wrong with the scope of the
ensure guard.
If you want to be able to use the detectFile:do: to write code like:
detectFile: [|fileStream|
                 fileStream := FileStream newFileNamed: 'test.txt'.
                 fileStream encoder: ISO88592TextConverter new.
                  fileStream] do: [:aStream doStuff]
the expanded guard would possibly be needed, to me that's giving aBlock
too much responsibility though, I'd rather it'd be forced written:
FileStream detectFile: [FileStream newFileNamed: 'test.txt']
                    do: [:aStream encoder: ISO88592TextConverter new;
doStuff]

Cheers,
Henry

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: readOnlyFileNamed:do: vs. ...

Stéphane Ducasse
>>
> Not really.
> Either there's no error when you request the file be opened, and the
> block works, or you don't, and there's nothing to close.
> So as long as the block meant to open a file really does nothing else
> but try to open a file, there's nothing wrong with the scope of the
> ensure guard.


Ok you mean that if the opening fails by its exact failure we do not have
to close?
-- ok I did not think about failures as equals to do not create an handle that we
should close :)

tx to open my eyes on the obvious

> If you want to be able to use the detectFile:do: to write code like:
> detectFile: [|fileStream|
>                 fileStream := FileStream newFileNamed: 'test.txt'.
>                 fileStream encoder: ISO88592TextConverter new.
>                  fileStream] do: [:aStream doStuff]
> the expanded guard would possibly be needed, to me that's giving aBlock
> too much responsibility though, I'd rather it'd be forced written:
> FileStream detectFile: [FileStream newFileNamed: 'test.txt']
>                    do: [:aStream encoder: ISO88592TextConverter new;
> doStuff]
>
> Cheers,
> Henry
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project


_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: readOnlyFileNamed:do: vs. ...

Stéphane Ducasse
I just published a nice slice and the code produced is nice really nice.

http://code.google.com/p/pharo/issues/list?cursor=1910

Stef

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: readOnlyFileNamed:do: vs. ...

johnmci
In reply to this post by Schwab,Wilhelm K
Ok, I can speak from experience that the file clean up logic does work in Squeak because in Sophie we would open 100's of files in a big project and we did discover a bug in
the finalization logic that the FileStream uses to cleanup linkages after the copy of the object is made in the finalization logic.  

This was not the fault of how the finalization logic works, rather it was a bug in the user code.

Now the reason for the finalization on streams is that in most/all? file systems system resources are a finite resource, (see denial of service attacks).  Isn't it 1024 file handles in os-x?
If you allocate a file,work with it, then GC the stream without closing then Squeak might be fine, but the actual operating system resource is dangling.

So you need the finalization to cleanup, unless you can get the programmer always to promise to close the handle. Perhaps when an ensure:[] (always used everywhere?).
However the model usually used is open the file, and sometime in the future close it via some other logic path we can't force fit the programmer into an ensure: [] pattern.

So on a file handle open failure the code code runs a full GC because you could have the case where you have 900 handles allocated by GCed objects in OldSpace. Since
object in Old Space could be dead, but not realized as dead yet a full gc is required to clean them up, which triggers the finalization, which frees the zombie handles.

This is not a fault of the finalization logic, it's to do with having multiple collectors and wanting to minimize the effort put into looking for dead objects, until we really really have to...

So the logic on failure will trigger a full GC/finalization process, and then on the 2nd attempt  if required fail because you've actually run into a system resource issue, or some other fatal file open error.


I am wondering here if your benchmark results are clouded by having a fullGC run. I'd check fullGC count before/after to see how it's being changed.  


On 2010-02-02, at 4:00 AM, Schwab,Wilhelm K wrote:

> Henry,
>
> I would not toss your first thought too quickly.  Staying clear of finalization is a big benefit.  I have never been able to trust it (for files) on Dolphin, which has a remarkably better implementation of weak collections and finalization than Squeak/Pharo.
>
> Bill
>
>
>
> -----Original Message-----
> From: [hidden email] [mailto:[hidden email]] On Behalf Of Henrik Johansen
> Sent: Tuesday, February 02, 2010 5:08 AM
> To: [hidden email]
> Subject: Re: [Pharo-project] readOnlyFileNamed:do: vs. ...
>
> Scratch that. It's what I would have done with a fileNamed:do: selector :)
>
> The main difference is readOnlyFileNamed:do: does not evaluate the block if file creation returns nil, while the second code errors out with stream dnu setConverterFromCode if the file could not be opened for any reason.
>
> Cheers,
> Henry
>
> Den 02.02.2010 11:02, skrev Henrik Johansen:
>> My guess is he uses a method which does not register the file in
>> FileRegistry, thus avoids a WeakDict cleanup when he closes the file
>> after the block has been evaluated.
>> (eg readOnlyFileNamed:do: vs. readOnlyFileNamed:) Thus, a lot of time
>> is saved from not doing weak registry cleanup.
>> (Which can be a large chunk of the total time when you do many small
>> operations).
>>
>> Cheers,
>> Henry
>>
>> Den 02.02.2010 10:50, skrev Stéphane Ducasse:
>>
>>> Nicolas I saw that in squeak you changed and use readOnlyFileNamed:do:
>>> I like it but I would like to understand what is the key advantage
>>>
>>>
>>> fromFileNamed: aName
>>>      FileStream readOnlyFileNamed: aName do: [:stream |
>>>               stream setConverterForCode.
>>>               self fileInFrom: stream]
>>>
>>> fromFileNamed: aName
>>>       | stream |
>>>       stream := FileStream readOnlyFileNamed: aName.
>>>       stream setConverterForCode.
>>>       [self fileInFrom: stream] ensure: [stream close].
>>>
>>> _______________________________________________
>>> Pharo-project mailing list
>>> [hidden email]
>>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>>
>>>
>>>
>>>
>> _______________________________________________
>> Pharo-project mailing list
>> [hidden email]
>> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>>
>>
>>
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project

--
===========================================================================
John M. McIntosh <[hidden email]>   Twitter:  squeaker68882
Corporate Smalltalk Consulting Ltd.  http://www.smalltalkconsulting.com
===========================================================================





_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: readOnlyFileNamed:do: vs. ...

Henrik Sperre Johansen
On 02.02.2010 23:06, John M McIntosh wrote:
> Ok, I can speak from experience that the file clean up logic does work in Squeak because in Sophie we would open 100's of files in a big project and we did discover a bug in
> the finalization logic that the FileStream uses to cleanup linkages after the copy of the object is made in the finalization logic.
>    
I have seen no problems with finalization not working (eventually, which
I guess is what Bill alluded to).
I 100% agree with having it as standard for the common file handle
creation methods.
There are cases such as existsFile:, newFileNamed:do: and friends, etc.,
where you only use a stream temporarily and can close it before the end
of the method's scope, not registering it in those cases does make a
difference. (Which is what the test was meant to show).
Whether it would actually be better to just not close it manually, and
wait for the finalization to do it instead, might be worth investigating.

> This was not the fault of how the finalization logic works, rather it was a bug in the user code.
>
> Now the reason for the finalization on streams is that in most/all? file systems system resources are a finite resource, (see denial of service attacks).  Isn't it 1024 file handles in os-x?
> If you allocate a file,work with it, then GC the stream without closing then Squeak might be fine, but the actual operating system resource is dangling.
>
> So you need the finalization to cleanup, unless you can get the programmer always to promise to close the handle. Perhaps when an ensure:[] (always used everywhere?).
> However the model usually used is open the file, and sometime in the future close it via some other logic path we can't force fit the programmer into an ensure: [] pattern.
>
> So on a file handle open failure the code code runs a full GC because you could have the case where you have 900 handles allocated by GCed objects in OldSpace. Since
> object in Old Space could be dead, but not realized as dead yet a full gc is required to clean them up, which triggers the finalization, which frees the zombie handles.
In the example, the stream is closed manually by the old version of
RemoteString text.
Thus, no finalization really triggers in this case, but the close still
triggers a rehash of the weak FileRegistry on every close.
>
>
> This is not a fault of the finalization logic, it's to do with having multiple collectors and wanting to minimize the effort put into looking for dead objects, until we really really have to...
>
> So the logic on failure will trigger a full GC/finalization process, and then on the 2nd attempt  if required fail because you've actually run into a system resource issue, or some other fatal file open error.
>
>
> I am wondering here if your benchmark results are clouded by having a fullGC run. I'd check fullGC count before/after to see how it's being changed.
>    

As stated above, the handles were closed manually, thus I'd wager the
finalization process never did get to run to clean them up.
Either way, as long as it consistently reproduces shorter runtimes, does
it really matter what the causes were?

Cheers,
Henry


_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: readOnlyFileNamed:do: vs. ...

Stéphane Ducasse
Hi henrik

if you get a chance to have a look at the slice I uploaded it would be good.
I like the idea that people have a look at other code, which is what we do when we integrate
now for my own code I always feel nervous :)

Stef

On Feb 3, 2010, at 9:40 AM, Henrik Sperre Johansen wrote:

> On 02.02.2010 23:06, John M McIntosh wrote:
>> Ok, I can speak from experience that the file clean up logic does work in Squeak because in Sophie we would open 100's of files in a big project and we did discover a bug in
>> the finalization logic that the FileStream uses to cleanup linkages after the copy of the object is made in the finalization logic.
>>
> I have seen no problems with finalization not working (eventually, which
> I guess is what Bill alluded to).
> I 100% agree with having it as standard for the common file handle
> creation methods.
> There are cases such as existsFile:, newFileNamed:do: and friends, etc.,
> where you only use a stream temporarily and can close it before the end
> of the method's scope, not registering it in those cases does make a
> difference. (Which is what the test was meant to show).
> Whether it would actually be better to just not close it manually, and
> wait for the finalization to do it instead, might be worth investigating.
>> This was not the fault of how the finalization logic works, rather it was a bug in the user code.
>>
>> Now the reason for the finalization on streams is that in most/all? file systems system resources are a finite resource, (see denial of service attacks).  Isn't it 1024 file handles in os-x?
>> If you allocate a file,work with it, then GC the stream without closing then Squeak might be fine, but the actual operating system resource is dangling.
>>
>> So you need the finalization to cleanup, unless you can get the programmer always to promise to close the handle. Perhaps when an ensure:[] (always used everywhere?).
>> However the model usually used is open the file, and sometime in the future close it via some other logic path we can't force fit the programmer into an ensure: [] pattern.
>>
>> So on a file handle open failure the code code runs a full GC because you could have the case where you have 900 handles allocated by GCed objects in OldSpace. Since
>> object in Old Space could be dead, but not realized as dead yet a full gc is required to clean them up, which triggers the finalization, which frees the zombie handles.
> In the example, the stream is closed manually by the old version of
> RemoteString text.
> Thus, no finalization really triggers in this case, but the close still
> triggers a rehash of the weak FileRegistry on every close.
>>
>>
>> This is not a fault of the finalization logic, it's to do with having multiple collectors and wanting to minimize the effort put into looking for dead objects, until we really really have to...
>>
>> So the logic on failure will trigger a full GC/finalization process, and then on the 2nd attempt  if required fail because you've actually run into a system resource issue, or some other fatal file open error.
>>
>>
>> I am wondering here if your benchmark results are clouded by having a fullGC run. I'd check fullGC count before/after to see how it's being changed.
>>
>
> As stated above, the handles were closed manually, thus I'd wager the
> finalization process never did get to run to clean them up.
> Either way, as long as it consistently reproduces shorter runtimes, does
> it really matter what the causes were?
>
> Cheers,
> Henry
>
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project


_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: readOnlyFileNamed:do: vs. ...

Henrik Sperre Johansen
On 03.02.2010 10:26, Stéphane Ducasse wrote:
> Hi henrik
>
> if you get a chance to have a look at the slice I uploaded it would be good.
> I like the idea that people have a look at other code, which is what we do when we integrate
> now for my own code I always feel nervous :)
>
> Stef
>    
Got time tonight, here's my comments:

DigitalSignatureAlgorithm class >> testExamplesFromDisk
May be better as a real test than as a class-side method? The method
usage change seems fine though ;)

FilePackage >> fromFileNamed:
Should be inlined in the class-side method called the same, and removed.
Only this method is ever called anyways, and it makes less sense to
accept from... style initialization methods on the instances.


MCVersionReader class >> file:streamDo:
Could instead be inlined in versionFromFile which is the only sender,
then removed.
The way it's now, it's really only a MCVersionReader-specific alias for
the exact same method :)

MCVersionReader class >> versionFromFile:
     ^FileStream readOnlyFileNamed: fileName do: [:fileStream | self
versionFromStream: fileStream]


UUIDGenerator>>makeUnixSeed
You set the stream to binary, there should be no need to set the
converter as well (it won't be used in binary mode).

Other than that, the changes look ok to me, and make the methods easier
to read (imo)

Cheers,
Henry

_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project
Reply | Threaded
Open this post in threaded view
|

Re: readOnlyFileNamed:do: vs. ...

Stéphane Ducasse
Thanks henrik. I'm in the process of integrating the changes :)
I was just systematically converting. So I will create a new issue for these remarks.

Stef

On Feb 4, 2010, at 10:02 PM, Henrik Sperre Johansen wrote:

> On 03.02.2010 10:26, Stéphane Ducasse wrote:
>> Hi henrik
>>
>> if you get a chance to have a look at the slice I uploaded it would be good.
>> I like the idea that people have a look at other code, which is what we do when we integrate
>> now for my own code I always feel nervous :)
>>
>> Stef
>>
> Got time tonight, here's my comments:
>
> DigitalSignatureAlgorithm class >> testExamplesFromDisk
> May be better as a real test than as a class-side method? The method
> usage change seems fine though ;)
>
> FilePackage >> fromFileNamed:
> Should be inlined in the class-side method called the same, and removed.
> Only this method is ever called anyways, and it makes less sense to
> accept from... style initialization methods on the instances.
>
>
> MCVersionReader class >> file:streamDo:
> Could instead be inlined in versionFromFile which is the only sender,
> then removed.
> The way it's now, it's really only a MCVersionReader-specific alias for
> the exact same method :)
>
> MCVersionReader class >> versionFromFile:
>     ^FileStream readOnlyFileNamed: fileName do: [:fileStream | self
> versionFromStream: fileStream]
>
>
> UUIDGenerator>>makeUnixSeed
> You set the stream to binary, there should be no need to set the
> converter as well (it won't be used in binary mode).
>
> Other than that, the changes look ok to me, and make the methods easier
> to read (imo)
>
> Cheers,
> Henry
>
> _______________________________________________
> Pharo-project mailing list
> [hidden email]
> http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project


_______________________________________________
Pharo-project mailing list
[hidden email]
http://lists.gforge.inria.fr/cgi-bin/mailman/listinfo/pharo-project