FileSystem file attributes and #isSymlink patch

alistairgrant
Hi All,

I'm nearly ready to submit a patch that started with the goal of being
able to retrieve the device id and fixing FileReference>>isSymlink.  It
also follows Esteban's suggestion of splitting out file existence and
other attributes (which provides a performance gain).  See the summary
below for a description of the changes.

The patch involves adding a new VM plugin, FileAttributesPlugin.  To
minimise the chance of any problems along the way I'd like to submit the
patch in three steps:

1. Add the VM plugin (FileAttributesPlugin)
2. Add the code to the image that allows testing of the code and plugin,
   but doesn't do any integration with existing functionality.
   This will allow the plugin to be tested by a few volunteers
   (hopefully).
3. Add the patches that make the switch over to the new implementation.


Can someone point me to how to submit a new VM plugin?  The code is
contained in a subclass of InterpreterPlugin.

I've been using this as my production environment for about 2 months now
on a Linux 32-bit VM.  Running the full test suite results in the same
set of tests failing before and after applying the patch.

I've also run the file-related tests on Linux 64-bit (run the Test Runner,
select all packages with "file" as part of the name and run all the
available tests) and the full test suite on Windows 32-bit.

The summary is:

1. #isSymlink now works properly on Linux (and it should also work on
   MacOS and BSD).
2. The list of file attributes available from FileReference now is:
        #accessTime (new)
        #changeTime (new)
        #creationTime
        #deviceId (new)
        #exists
        #gid (new)
        #inode (new)
        #isBlock (new)
        #isCharacter (new)
        #isDirectory
        #isExecutable (new)
        #isFIFO (new)
        #isFile
        #isReadable
        #isRegular (new)
        #isSocket (new)
        #isSymlink (works)
        #isWritable
        #modificationTime
        #numberOfHardLinks (new)
        #permissions
        #size
        #targetFile (new)
        #uid (new)
3. FileReference>>exists is faster than before (well, at least on my
   Linux laptop).  This is useful as it is called quite often.
4. It is possible to retrieve the attributes of a symbolic link itself,
   i.e. all the attributes listed above plus the target path.


Given how similar MacOS and BSD are to Linux, I assume that this will
all work without problems on those platforms (but it obviously needs to
be tested).

As implied above, the changes are all backward compatible (except
the broken #isSymlink), although a couple deserve mention:

1. Obviously #isSymlink now answers correctly (previously it would only
   answer correctly for a broken link).
2. Requesting any of the attributes listed above (except #isSymlink)
   will answer the value for the link's target.  If the FileReference is
   to a broken symbolic link, it will answer the attributes of the
   symbolic link itself (keeping existing behaviour).
3. The attributes of a symbolic link can be retrieved using
   FileReference>>symlinkAttributes.
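
For readers less familiar with the underlying POSIX calls, the three
behaviours above can be sketched roughly as follows.  This is a Python
analogy, not the plugin or image code: the function names are made up
for illustration, and it assumes a POSIX system where creating symbolic
links is permitted.

```python
import os
import stat
import tempfile


def is_symlink(path):
    """True if path itself is a symbolic link, even when its target is missing."""
    try:
        return stat.S_ISLNK(os.lstat(path).st_mode)  # lstat never follows the link
    except FileNotFoundError:
        return False


def attributes(path):
    """Attributes of the link target, or of the link itself if it is broken."""
    try:
        return os.stat(path)    # stat follows symbolic links
    except FileNotFoundError:
        return os.lstat(path)   # broken link: describe the link itself


def symlink_attributes(path):
    """Attributes of the link itself, never of its target."""
    return os.lstat(path)


def demo():
    """Create a file, a good link and a broken link, then report what each call sees."""
    with tempfile.TemporaryDirectory() as d:
        target = os.path.join(d, "target.txt")
        link = os.path.join(d, "link")
        broken = os.path.join(d, "broken")
        with open(target, "w") as f:
            f.write("hello")
        os.symlink(target, link)
        os.symlink(os.path.join(d, "missing"), broken)
        return {
            "link_is_symlink": is_symlink(link),
            "target_is_symlink": is_symlink(target),
            "size_via_link": attributes(link).st_size,
            "broken_answers_link": stat.S_ISLNK(attributes(broken).st_mode),
            "link_itself": stat.S_ISLNK(symlink_attributes(link).st_mode),
        }


if __name__ == "__main__":
    print(demo())
```

The fallback in attributes() mirrors point 2: a broken link still
answers something useful rather than failing outright.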

Overall, performance is slightly better than before.  Code that
needs to access multiple attributes and is written to take advantage of
the new behaviour will see significant performance improvements.
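
The reason for the gain is that a single stat-style call already
answers most of the attributes at once, so code that asks for several
of them can fetch the record once and read the fields from it.  A rough
Python illustration of the idea (again an analogy, not the Pharo API;
summarize and the dictionary keys are hypothetical names):

```python
import os
import stat
import tempfile


def summarize(path):
    """Fetch the attribute record once and read several fields from it."""
    st = os.stat(path)  # one call to the operating system
    return {
        "size": st.st_size,
        "is_directory": stat.S_ISDIR(st.st_mode),
        "is_regular": stat.S_ISREG(st.st_mode),
        "inode": st.st_ino,
        "device_id": st.st_dev,
        "uid": st.st_uid,
        "gid": st.st_gid,
        "number_of_hard_links": st.st_nlink,
        "modification_time": st.st_mtime,
    }


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "example.txt")
        with open(path, "w") as f:
            f.write("abc")
        print(summarize(path))
```

Asking for each attribute separately would repeat the same lookup for
every field, which is the cost the new behaviour avoids.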


If you've got this far and forgotten the original question :-)

Can someone point me to how to submit a new VM plugin?


Thanks,
Alistair

Re: FileSystem file attributes and #isSymlink patch

Max Leske
Nice work Alistair!

 


Re: FileSystem file attributes and #isSymlink patch

Stephane Ducasse-3
Hi Alistair,

It is super cool that you are pushing this!
I mean it for real, because this symlink stuff was always getting in
our way for scripts.
It is really great to see this happening.
Stef


Re: FileSystem file attributes and #isSymlink patch

Tudor Girba-2
Thank you very much, indeed!

Doru



--
www.tudorgirba.com
www.feenk.com

"Innovation comes in the least expected form.
That is, if it is expected, it already happened."


Re: [Vm-dev] FileSystem file attributes and #isSymlink patch

Stephane Ducasse-3
Hi Alistair

Yes it should become part of the core of Pharo :).

Stef

On Tue, Jul 25, 2017 at 5:21 PM, Alistair Grant <[hidden email]> wrote:

> Hi David,
>
> Thanks very much for your follow-up.
>
> On Mon, Jul 24, 2017 at 07:28:07PM -0400, David T. Lewis wrote:
>>
>> Hi Alistair,
>>
>> I am copying this to vm-dev for follow up on the plugin, see below.
>>
>>
>> On Mon, Jul 24, 2017 at 08:09:10AM +0000, Alistair Grant wrote:
>> > Hi All,
>> >
>> > I'm nearly ready to submit a patch that started with the goal of being
>> > able to retrieve the device id and fixing FileReference>>isSymlink and
>> > also follows Esteban's suggestion of splitting out file existence and
>> > other attributes (which provides a performance gain).  See the summary
>> > below for a description of the changes.
>> >
>> > The patch involves adding a new VM plugin, FileAttributesPlugin.  To
>> > minimise the chance of any problems along the way I'd like to submit the
>> > patch in three steps:
>> >
>> > 1. Add the VM plugin (FileAttributesPlugin)
>> > 2. Add the code to the image that allows testing of the code and plugin,
>> >    but doesn't do any integration with existing functionality.
>> >    This will allow the plugin to be tested by a few volunteers
>> >    (hopefully).
>> > 3. Add the patches that make the switch over to the new implementation.
>> >
>>
>> I think you are handling this in exactly the right way, kudos.
>>
>> >
>> > Can someone point me to how to submit a new VM plugin?  The code is
>> > contained in a subclass of InterpreterPlugin.
>>
>> Following up on the vm-dev list: If your plugin is available in a Monticello
>> repository, that would be great because VM builders can easily include it
>> and try it out. Any repository would be fine for starters, or if you have
>> an account on squeaksource.com I can add you as developer in the somewhat
>> loosely-related DirectoryPlugin project if that is of any help. Whatever is
>> convenient for you.
>
> smalltalkhub.com is probably easiest for me, thanks.
>
>
>> Your plugin sounds like something that would be stable and require little
>> maintenance over time, so it might make sense to pull it directly into
>> the VMMaker package. We can discuss that on the vm-dev list.
>
> I'm hoping this will become a core part of Pharo, so ultimately it
> should live with the other Pharo core plugins like FilePlugin (assuming
> it is accepted, of course).
>
>
>> Once your plugin code is available, it should be straightforward to start
>> including it in the various VM build configurations.
>
> Great, thanks.  build.linux32x86/pharo.cog.spur is probably the best
> place to start.
>
> I should be able to post the plugin to smalltalkhub.com quite
> quickly.  The smalltalk code will take me a while to repackage as I'll
> want to test it fairly carefully, and my time is very scattered (this is
> my hobby, but lots of family demands :-)).
>
>
> As a side effect, once this has been fully integrated it will be
> possible to get rid of at least some of the "#if Pharo" type
> conditionals in the FilePlugin, allowing the code to be tidied up.
>
> I'll post another reply once I've got the plugin code in
> smalltalkhub.com.
>
>
> Thanks again,
> Alistair
>
>
>
>> Dave

Re: FileSystem file attributes and #isSymlink patch

alistairgrant
Hi All,

I've submitted the FileAttributesPlugin for inclusion in the VM (you can
follow on the vm-dev mailing list if interested).

As Cyril Ferlicot has been quite interested in the file system and its
performance, I thought I might make this available for early testing for
anyone that is interested.

As mentioned previously, the patch:

- Extends the public interface to the file system (FileReference)
  relating to file attributes and iterating over directories
- Doesn't make any changes to existing public behaviour (that I'm
  aware of)
- Makes significant changes to the internals (FileSystem,
  DiskStore and subclasses)
- Doesn't touch functionality relating to opening & closing files,
  file IO, etc.

The limitation is that it is only available through the Ubuntu snap,
so you need one of:

- Ubuntu 14.04 or later (16.04 or later preferred)
- A container with Ubuntu 16.04 or later, e.g. Docker
- The ability to build your own 6.1 or 7.0 VM

If you run:

sudo snap install pharo
mkdir fatest
cd fatest
sudo pharo.conf
pharo.getimage32
pharo.ui Pharo.image

And then follow the instructions at: https://github.com/akgrant43/FileAttributes
you'll have the new file attributes code.

Cheers,
Alistair


