Hi All,

I'm in the process of making FileReference>>moveTo: work across devices. To make the implementation a bit cleaner on Linux it would be nice to know whether the source and destination reside on the same disk filesystem. This can be determined by comparing stat.st_dev for the source file and destination directory.

dir_EntryLookup() in sqUnixFile.c is exposed as primitiveDirectoryEntry and used by FilePluginPrims>>lookupDirectory:filename:

It basically does a stat() on the supplied file and returns most of the resulting information. Unfortunately st_dev isn't returned.

Would there be any objection to me extending the Pharo version of dir_EntryLookup() to include st_dev?

Thanks,
Alistair
|
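For readers following along, the st_dev comparison described in this post can be sketched in plain C roughly as follows. This is only an illustration of the idea, not code from the VM: the function name same_filesystem is hypothetical, and it assumes a POSIX system where stat() is available.

```c
#include <sys/stat.h>

/* Hypothetical helper, not part of sqUnixFile.c.
 * Returns 1 if both paths live on the same filesystem (same st_dev),
 * 0 if they do not, and -1 if either stat() call fails. */
int same_filesystem(const char *source_path, const char *dest_dir)
{
    struct stat src, dst;

    if (stat(source_path, &src) != 0 || stat(dest_dir, &dst) != 0)
        return -1;

    return src.st_dev == dst.st_dev ? 1 : 0;
}
```

A move could then use a plain rename when the result is 1 and fall back to copy-and-delete when it is 0.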
Hi Alistair,

in the past we extended dir_EntryLookup because a lot of important information was missing. With time, I realised that maybe it would have been better to add an extra primitive to get extended properties. st_dev is not used very often, but I understand it can be good to have access to it.

Then, my question now is: wouldn't it be better to add an "extended properties" primitive, to answer st_dev and others?

Esteban

> On 21 Apr 2017, at 10:32, Alistair Grant <[hidden email]> wrote:
>
> Hi All,
>
> I'm in the process of making FileReference>>moveTo: work across devices.
> To make the implementation a bit cleaner on Linux it would be nice to
> know whether the source and destination reside on the same disk
> filesystem. This can be determined by comparing stat.st_dev for the
> source file and destination directory.
>
> dir_EntryLookup() in sqUnixFile.c is exposed as primitiveDirectoryEntry
> and used by FilePluginPrims>>lookupDirectory:filename:
>
> It basically does a stat() on the supplied file and returns most of the
> resulting information. Unfortunately st_dev isn't returned.
>
> Would there be any objection to me extending the Pharo version of
> dir_EntryLookup() to include st_dev?
>
> Thanks,
> Alistair
|
2017-04-21 11:28 GMT+02:00 Esteban Lorenzano <[hidden email]>:
Hi Esteban, from a maintenance POV, yes, you are right. A new primitive is better.
|
In reply to this post by EstebanLM
Hi Esteban,

Thanks for your reply.

On Fri, Apr 21, 2017 at 11:28:09AM +0200, Esteban Lorenzano wrote:
> Hi Alistair,
>
> in the past we extended dir_EntryLookup because a lot of important
> information was missing. With time, I realised that maybe it would
> have been better to add an extra primitive to get extended properties.
> st_dev is not used very often, but I understand it can be good to have
> access to it.
>
> Then, my question now is: wouldn't it be better to add an "extended
> properties" primitive, to answer st_dev and others?
>
> Esteban

I'm not sure I'm knowledgeable enough to answer this one. Which attributes do you consider core and which are extended?

The only thing I can say at the moment is that dir_(Entry)Lookup really only does one thing, get the stat structure and return (most of) the information, and that st_dev is an integer, so it is quite lightweight to include. From that perspective (and not knowing much else), it doesn't seem to make much sense to split it in two.

Unless you're saying you'd like to split it into:

- Does the entry exist (just return a boolean)?
- Return information about the entry

I guess that would depend on how often the first is called without the second soon(ish) after, and the additional cost of including the information in the first place.

I'll try and have a look at the message flow leading to dir_Lookup and dir_EntryLookup.

Cheers,
Alistair
|
In reply to this post by EstebanLM
On Fri, Apr 21, 2017 at 11:28:09AM +0200, Esteban Lorenzano wrote:
> Hi Alistair,
>
> in the past we extended dir_EntryLookup because a lot of important
> information was missing. With time, I realised that maybe it would
> have been better to add an extra primitive to get extended properties.
> st_dev is not used very often, but I understand it can be good to have
> access to it.
>
> Then, my question now is: wouldn't it be better to add an "extended
> properties" primitive, to answer st_dev and others?

+1

I think that is a much better approach.

Dave
|
On Fri, Apr 21, 2017 at 08:13:40AM -0400, David T. Lewis wrote:
> On Fri, Apr 21, 2017 at 11:28:09AM +0200, Esteban Lorenzano wrote:
> > Hi Alistair,
> >
> > in the past we extended dir_EntryLookup because a lot of important
> > information was missing. With time, I realised that maybe it would
> > have been better to add an extra primitive to get extended
> > properties. st_dev is not used very often, but I understand it can
> > be good to have access to it.
> >
> > Then, my question now is: wouldn't it be better to add an "extended
> > properties" primitive, to answer st_dev and others?
>
> +1
>
> I think that is a much better approach.

There appears to be unanimous agreement to refactor this.

The way the primitives are currently used is that there is a method for each attribute, plus the basic existence check. All of them retrieve the entire collection of information and pick out the piece they need.

I'm not sure I understand Esteban's suggestion of creating an "extended properties" primitive. However, my suggestion is to add dir_(Entry)LookupAttribute primitives which take an additional argument, the index of the attribute to be returned:

 0  File exists
 1  File name
 2  st_mode
 3  st_ino
 4  st_dev
 5  st_nlink
 6  st_uid
 7  st_gid
 8  st_size
 9  st_atime
10  st_mtime
11  st_ctime
12  st_blocks
13  st_blksize

I'll then refactor the FileSystemStore hierarchy to use the new primitives.

I can't modify the Windows code, so it will simulate the new primitive while using the old one. Once Windows has the new primitive implemented, the old one will no longer be called from within the image and, barring any backward compatibility requirements, could be removed.

Am I missing anything?

Thanks,
Alistair
|
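To make the indexed-attribute proposal above concrete, here is a rough C sketch of what such a lookup might look like outside the VM. The enum and function names are hypothetical; only the index numbers come from the list in the post. It assumes a POSIX struct stat that includes st_blocks and st_blksize (true on Linux and macOS), and it handles only the numeric attributes; the file name (index 1) is a string and would need a separate path.

```c
#include <sys/stat.h>

/* Index values follow the list proposed in the post; names are hypothetical. */
enum file_attr {
    ATTR_EXISTS = 0, ATTR_NAME = 1, ATTR_MODE = 2, ATTR_INO = 3,
    ATTR_DEV = 4, ATTR_NLINK = 5, ATTR_UID = 6, ATTR_GID = 7,
    ATTR_SIZE = 8, ATTR_ATIME = 9, ATTR_MTIME = 10, ATTR_CTIME = 11,
    ATTR_BLOCKS = 12, ATTR_BLKSIZE = 13
};

/* Stores one numeric attribute of `path` in *value.
 * Returns 0 on success, -1 if stat() fails or the index is not a
 * numeric attribute. ATTR_EXISTS always succeeds and stores 1 or 0. */
int lookup_attribute(const char *path, int index, long long *value)
{
    struct stat st;
    int found = (stat(path, &st) == 0);

    if (index == ATTR_EXISTS) {
        *value = found;
        return 0;
    }
    if (!found)
        return -1;

    switch (index) {
    case ATTR_MODE:    *value = (long long)st.st_mode;    break;
    case ATTR_INO:     *value = (long long)st.st_ino;     break;
    case ATTR_DEV:     *value = (long long)st.st_dev;     break;
    case ATTR_NLINK:   *value = (long long)st.st_nlink;   break;
    case ATTR_UID:     *value = (long long)st.st_uid;     break;
    case ATTR_GID:     *value = (long long)st.st_gid;     break;
    case ATTR_SIZE:    *value = (long long)st.st_size;    break;
    case ATTR_ATIME:   *value = (long long)st.st_atime;   break;
    case ATTR_MTIME:   *value = (long long)st.st_mtime;   break;
    case ATTR_CTIME:   *value = (long long)st.st_ctime;   break;
    case ATTR_BLOCKS:  *value = (long long)st.st_blocks;  break;
    case ATTR_BLKSIZE: *value = (long long)st.st_blksize; break;
    default:           return -1;  /* ATTR_NAME or unknown index */
    }
    return 0;
}
```

Note that every call still performs a full stat(), so asking for several attributes one at a time repeats the system call.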
> On 21-04-2017, at 10:17 AM, Alistair Grant <[hidden email]> wrote:
>
> Am I missing anything?

I think so; I urge you to consider working with Dave Lewis to see if it might make sense to improve his DirectoryPlugin.

http://www.squeaksource.com/DirectoryPlugin.html
http://wiki.squeak.org/squeak/2274

More generally the file stuff is quite a convoluted mess. Any concerted effort to clean it up, improve performance and error handling and even (gasp!) document where it does well or poorly, would be welcomed.

tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
If you don't pay the exorcist do you get repossessed?
|
On Fri, Apr 21, 2017 at 10:37 AM, tim Rowledge <[hidden email]> wrote:
+1. Further, primitive invocation is slow. Try and provide a bulk primitive that answers multiple attributes, especially if the attributes are obtained from a single system call (as is the case with stat). Further, try and come up with cross-platform abstractions so that a single primitive can be used across platforms. One of the things that would require a lot of thought is harmonising Unix symbolic links, Mac OS X Aliases and Windows Aliases.
_,,,^..^,,,_ best, Eliot |
On Fri, Apr 21, 2017 at 11:36:52AM -0700, Eliot Miranda wrote:
> On Fri, Apr 21, 2017 at 10:37 AM, tim Rowledge <[hidden email]> wrote:
> > On 21-04-2017, at 10:17 AM, Alistair Grant <[hidden email]> wrote:
> > >
> > > Am I missing anything?
> >
> > I think so; I urge you to consider working with Dave Lewis to see if it
> > might make sense to improve his DirectoryPlugin.
>
> +1.

Sure. I don't know anything about the DirectoryPlugin, and it looks like it isn't part of Pharo, but I'm happy to help. It would be particularly good if David can help with the Windows side of things.

David?

> Further, primitive invocation is slow. Try and provide a bulk primitive that
> answers multiple attributes, especially if the attributes are obtained from a
> single system call (as is the case with stat).

My current thoughts, which may change after David's input and any other feedback...

In Pharo the majority of calls are just to check file existence. The remaining calls typically use only one element of all the information returned (in the base Pharo image; I can't speak for applications, of course). That was why I proposed extending the primitive to indicate which piece of information to return.

Following on from Eliot's comments about returning multiple attributes: the current methods in Pharo test / return one piece of information at a time. I don't think we can make assumptions about how long the attribute information is valid, so we can't cache the information. That means we would create an attribute object that is returned to the caller and leave it up to them to manage how long to consider it valid.

> Further, try and come up with cross-platform abstractions so that a single
> primitive can be used across platforms. One of the things that would require a
> lot of thought is harmonising Unix symbolic links, Mac OS X Aliases and Windows
> Aliases.

Hopefully the file attribute object I mentioned above could manage the cross-platform interpretation, leaving the primitives to just return the basic information.

> > http://www.squeaksource.com/DirectoryPlugin.html
> > http://wiki.squeak.org/squeak/2274
> >
> > More generally the file stuff is quite a convoluted mess. Any concerted
> > effort to clean it up, improve performance and error handling and even
> > (gasp!) document where it does well or poorly, would be welcomed.

While I was profiling this I noticed that DiskStore>>defaultWorkingDirectory is the most called method by a factor of about 3, as it is called every time a filename is resolved.

#defaultWorkingDirectory is relatively expensive as it calls a primitive to get the image directory, converts the resulting ByteString to a byte array, calls ZnCharacterEncoder to decode it, converts the string to a path, and finally gets the parent.

As an example of how often it can be called: using Iceberg to clone an empty repository, add a small package and synchronise the repository called #defaultWorkingDirectory around 500 times (+/- 50).

It is called so many times because FileSystem>>resolve: leads to FileSystem>>resolvePath:, which resolves the supplied path against the working directory, and #resolve: is called from many places.

Caching the result in memory using lazy initialisation makes #defaultWorkingDirectory more than 1,000 times faster (assuming I tested it correctly; I'll supply more details if we look into this). The tricky part is managing the cache during image Save As and quit / restart.

Cheers,
Alistair
|
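The lazy-initialisation idea mentioned at the end of this post, together with the need to reset the cache on events such as Save As or restart, can be sketched in C as below. This is an analogy only: in Pharo the cache would live in the image, not in C, and getcwd() merely stands in for the expensive image-directory lookup. All names (working_directory, invalidate_working_directory, lookup_count) are hypothetical.

```c
#include <stddef.h>
#include <unistd.h>

static char cached_dir[4096];  /* buffer size is an arbitrary assumption */
static int  cache_valid = 0;
static int  lookup_count = 0;  /* counts how often the slow path runs */

/* Answers the working directory, computing it only on first use
 * or after the cache has been invalidated. Returns NULL on failure. */
const char *working_directory(void)
{
    if (!cache_valid) {
        if (getcwd(cached_dir, sizeof cached_dir) == NULL)
            return NULL;
        cache_valid = 1;
        lookup_count++;
    }
    return cached_dir;
}

/* To be called whenever the cached value may be stale,
 * e.g. the equivalent of image Save As or startup. */
void invalidate_working_directory(void)
{
    cache_valid = 0;
}
```

The hard part noted in the post is exactly the second function: knowing every event after which the cached value must be discarded.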
On Sun, Apr 23, 2017 at 04:56:29PM +0000, Alistair Grant wrote:
> Sure. I don't know anything about the DirectoryPlugin, and it looks
> like it isn't part of Pharo, but I'm happy to help. It would be
> particularly good if David can help with the Windows side of things.
>
> David?

Hi Alistair,

Sure, happy to help. You can find the VM plugin at http://www.squeaksource.com/DirectoryPlugin.

A small plugin is actually fairly simple to do, so take a look at the DirectoryPlugin as an example, and we can either add the functionality you want, or just use it as a pattern to make a plugin that does exactly what you need.

I have been thinking of splitting DirectoryPlugin into two smaller pieces, one of which would be called PosixFileStatPlugin. If we do that, we would have a small plugin that addresses only the Posix file stat functions, and we could fill it out to have it answer the fields that you are interested in (st_dev).

Regarding Windows, I am not currently in a position to do Windows plugin development, but I do have an idea of the issues involved and I'm happy to help where I can. Note that Windows and Unix have different models to represent "files" and "directories" so there are some actual semantic differences on the different platforms. The Unix/Posix stat function comes from the world of Unix, so some of the concepts may not completely line up with the Windows model. The best references I have found for this come from the Microsoft technical documentation (I do not have a link right now, but it is all available on the web).

Dave
|
In reply to this post by alistairgrant
On Sunday 23 April 2017 10:26 PM, Alistair Grant wrote:
> While I was profiling this I noticed that
> DiskStore>>defaultWorkingDirectory is the most called method by a factor
> of about 3, as it is called every time a filename is resolved.
>
> #defaultWorkingDirectory is relatively expensive as it calls a primitive
> to get the image directory, converts the resulting ByteString to a byte
> array, calls ZnCharacterEncoder to decode it, converts the string to a
> path, and finally gets the parent.

Directory paths like these are really part of the host environment. Is there any need to reify them within an image? As long as we use only relative paths, then paths like VM, Plugin, Image, current directory etc. can be encoded using special prefixes (say "$vm" or "~vm" or "vm:") in persistent image variables and expanded into full paths by the VM (through plugins) at run time using environment variables or command line options. The plugin can cache these nodes to improve performance.

Regards .. Subbu
|
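The prefix scheme suggested in this post could look something like the C sketch below. Everything in it is hypothetical: the struct, the function name expand_location, and the idea of passing the prefix table in as a parameter (a plugin would presumably build that table from environment variables or command line options, which the sketch does not attempt).

```c
#include <stdio.h>
#include <string.h>

/* One entry maps a prefix such as "vm:" to a host directory. */
struct location {
    const char *prefix;
    const char *directory;
};

/* Expands a leading prefix of `path` using `table` and writes the
 * result to `out`. Returns 0 on success, -1 if no prefix matches
 * or the expanded path does not fit in `out_size` bytes. */
int expand_location(const struct location *table, size_t count,
                    const char *path, char *out, size_t out_size)
{
    size_t i;

    for (i = 0; i < count; i++) {
        size_t len = strlen(table[i].prefix);
        if (strncmp(path, table[i].prefix, len) == 0) {
            int n = snprintf(out, out_size, "%s/%s",
                             table[i].directory, path + len);
            return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
        }
    }
    return -1;
}
```

With a table mapping "vm:" to /opt/vm, the path "vm:plugins/foo.so" would expand to /opt/vm/plugins/foo.so; the image would only ever store the prefixed form.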
On Mon, Apr 24, 2017 at 04:03:42PM +0530, K K Subbu wrote:
> On Sunday 23 April 2017 10:26 PM, Alistair Grant wrote:
> > While I was profiling this I noticed that
> > DiskStore>>defaultWorkingDirectory is the most called method by a factor
> > of about 3, as it is called every time a filename is resolved.
> >
> > #defaultWorkingDirectory is relatively expensive as it calls a primitive
> > to get the image directory, converts the resulting ByteString to a byte
> > array, calls ZnCharacterEncoder to decode it, converts the string to a
> > path, and finally gets the parent.
>
> Directory paths like these are really part of the host environment. Is there
> any need to reify them within an image?

Pharo uses the image directory as its current working directory. If you supply a relative path, it is resolved against the image directory.

I find this a bit unintuitive; I'd expect the defaultWorkingDirectory to be the same as that of the parent process, e.g. the result of running pwd in a shell. But changing it now would probably break a lot of code.

> As long as we use only relative
> paths, then paths like VM, Plugin, Image, current directory etc. can be
> encoded using special prefixes (say "$vm" or "~vm" or "vm:") in persistent
> image variables and expanded into full paths by VM (through plugins) at run
> time using environment variables or command line options.

FileLocator provides access to all these locations. I'm in two minds about providing string substitution.

> The plugin can
> cache these nodes to improve performance.

This is part of what I'm looking at.

Cheers,
Alistair
|
In reply to this post by David T. Lewis
Hi All,

I'm currently building the VM from the pharo-vm GitHub repository using:

    pharo-vm/opensmalltalk-vm/build.linux64x64/pharo.cog.spur/build/mvm

(or build.linux32x86)

Can someone kindly point me to how to build the Pharo 6 VM using VMMaker?

Hopefully I can then figure out how to incorporate Dave's DirectoryPlugin, and then figure out the path forward.

Thanks,
Alistair
|
In reply to this post by alistairgrant
You might need to check your assumptions here. Different OSs have different ideas about what - or even *if* - a current working directory means.

You also need to consider the *intent* of such paths. Setting a default to the directory in which the image was found may or may not make sense; what if it is a default image installed in some place that is not intended for general user storage? What if it is reached via an alias so any attempt to find the parent directory results in a very strange seeming result?

Caching such things can be fraught with pain too. Is it plausible that the actual working directory you thought was at wibble/foo/gerbil/cardboard-tube has been reset since you found that? What if the user or some administration process moves things around whilst you are running?

tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Useful random insult:- Mind like a steel sieve.
|
In reply to this post by alistairgrant
On 4/24/2017 11:26 AM, Alistair Grant wrote:
> ...
> Pharo uses the image directory as its current working directory. If you
> supply a relative path, it is resolved against the image directory.
>
> I find this a bit unintuitive, I'd expect the defaultWorkingDirectory to
> be the same as the parent process, e.g. the result of running pwd in a
> shell. But changing it now would probably break a lot of code.

In Cuis I did as you say. For instance,

    juan@juani5:~$ pwd
    /home/juan

Starting the image like:

    juan@juani5:~$ Rectifier/cogspur64/squeak Rectifier/Cuis-Smalltalk-Dev/Cuis5.0-3076-spur-64.image

And then, in Smalltalk, './' asDirectoryEntry evaluates to /home/juan

This is the comment for #currentDirectory:

    "Answer the current directory. In Unix it is the current directory in the OS shell that started us. In Windows the same happens if the image file is in a subtree of the Windows current directory."

For additional details, check in Cuis.

This has proved to be extremely useful for writing command line applications that integrate nicely with Bash, for instance an orthorectifier for satellite images.

Cheers,
--
Juan Vuletich
www.cuis-smalltalk.org
https://github.com/Cuis-Smalltalk/Cuis-Smalltalk-Dev
@JuanVuletich
|
In reply to this post by timrowledge
Hi Tim,

On Mon, Apr 24, 2017 at 09:46:59AM -0700, tim Rowledge wrote:
> You might need to check your assumptions here. Different OSs have
> different ideas about what - or even *if* - a current working
> directory means.
>
> You also need to consider the *intent* of such paths. Setting a
> default to the directory in which the image was found may or may not
> make sense; what if it is a default image installed in some place that
> is not intended for general user storage? What if it is reached via
> an alias so any attempt to find the parent directory results in a very
> strange seeming result?
>
> Caching such things can be fraught with pain too. Is it plausible that
> the actual working directory you thought was at
> wibble/foo/gerbil/cardboard-tube has been reset since you found that?
> What if the user or some administration process moves things around
> whilst you are running?

Yep. Anyway, changing the definition now would likely break lots of code.

I might look at adding a method that returns the parent process working directory (if one exists in Pharo, I haven't found it).

Thanks,
Alistair
|
In reply to this post by alistairgrant
On Mon, Apr 24, 2017 at 02:57:32PM +0000, Alistair Grant wrote:
> Hi All,
>
> I'm currently building the VM from the pharo-vm github repository using:
>
> pharo-vm/opensmalltalk-vm/build.linux64x64/pharo.cog.spur/build/mvm
>
> (or build.linux32x86)
>
> Can someone kindly point me to how to build the Pharo 6 VM using
> VMMaker?

Never mind, I've finally figured it out. In case anyone stumbles across this, the shortcut instructions I have for Pharo are (for the pre-release Pharo 6.0 VM):

1. Read Eliot's blog:
   http://www.mirandabanda.org/cogblog/compiling-the-vm/
2. The HowToBuild file is at:
   32 bit: oscogvm/build.linux32x86/HowToBuild
   64 bit: oscogvm/build.linux64x64/HowToBuild
3. mkdir oscogvm/sources && cp PharoV50.sources oscogvm/sources
4. Run: oscogvm/scripts/updateSCCSVersions
5. For Pharo, the build directory is:
   32 bit: oscogvm/build.linux32x86/pharo.cog.spur/build
   64 bit: oscogvm/build.linux64x64/pharo.cog.spur/build

If you want to build custom plugins, download the latest stable Squeak VM and image and run VMMaker from there following the instructions at http://wiki.squeak.org/squeak/2444

Then copy the plugin(s) source into:

    oscogvm/platforms/unix/plugins/

and add the plugin(s) to either:

    oscogvm/build.linux32x86/pharo.cog.spur/plugins.int
    oscogvm/build.linux32x86/pharo.cog.spur/plugins.ext

Then follow the build instructions above.

Cheers,
Alistair
|
In reply to this post by David T. Lewis
On Sun, Apr 23, 2017 at 07:10:10PM -0400, David T. Lewis wrote:
> Sure, happy to help. You can find the VM plugin at
> http://www.squeaksource.com/DirectoryPlugin.
>
> A small plugin is actually fairly simple to do, so take a look at the
> DirectoryPlugin as an example, and we can either add the functionality
> you want, or just use it as a pattern to make a plugin that does
> exactly what you need.
>
> I have been thinking of splitting DirectoryPlugin into two smaller
> pieces, one of which would be called PosixFileStatPlugin. If we do
> that, we would have a small plugin that addresses only the Posix file
> stat functions, and we could fill it out to have it answer the fields
> that you are interested in (st_dev).

I finally have a Pharo VM with your DirectoryPlugin available and working, so I think I'm ready to proceed... :-)

From what I can see, at the moment:

- DirectoryPlugin isn't a core part of Squeak, i.e. if I just download a VM and image from the website it won't be in the image or VM.
- The version of FilePlugin used in Pharo is different to that in Squeak (even though there are the Pharo #ifdefs in the Squeak version). The Pharo version is at, e.g. (there are multiple copies, I haven't looked to see if they are different from each other):
  https://github.com/pharo-project/pharo-vm/tree/master/mc/VMMaker.oscog.package/FilePlugin.class
  E.g. the Pharo version has added primitiveDirectoryEntry

Assuming my understanding is correct, and based on comments from Tim and Eliot, does that mean that there's a push to make DirectoryPlugin a core part of Squeak?

Given that DirectoryPlugin isn't part of Pharo, and the added complexity of the forked FilePlugin, I agree that creating a new PosixFileStatPlugin sounds like a good idea.

To meet what Pharo immediately needs, PosixFileStatPlugin would initially contain the following primitives:

- primitiveFileExists - answers a boolean indicating whether the file exists.
- primitiveFileAccess - answers a mask indicating what access the VM has to the file, i.e. read, write, execute, ???
- primitiveStatFile - answers an array which contains all the data returned by the libc stat() function.

The reason for these three is that checking for file existence in Pharo is much more common than any other operation, so we want to make this as quick as possible.

After that, code in the base image looks at only one attribute at a time. However, based on Eliot's comments, it sounds like the primitive call overhead is much larger than the object creation and garbage collection overhead, so the primitive should just return a single collection of all the information.

The Smalltalk code can then be modified to allow the user to access the attributes as required. For Pharo at least, I'd probably add a FileAttributes hierarchy of classes that can be used to interpret the information and deal with cross-platform differences:

    FileAttributes
      UnixFileAttributes
      MacFileAttributes
      WindowsFileAttributes
      ...

Please let me know what you'd do differently.

Thanks,
Alistair
|
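As a rough idea of what the first two proposed primitives might wrap on a POSIX system, here is a minimal C sketch built on access(). It is an assumption, not the plugin's actual design: the function names and the mask bit values are hypothetical, and access() checks permissions for the real user ID and follows symbolic links, so a dangling symlink would be reported as not existing.

```c
#include <unistd.h>

/* Hypothetical mask bits; the post does not define the encoding. */
enum { CAN_READ = 4, CAN_WRITE = 2, CAN_EXECUTE = 1 };

/* Answers 1 if `path` exists (as seen through any symlinks), else 0. */
int file_exists(const char *path)
{
    return access(path, F_OK) == 0;
}

/* Answers a bit mask of the access the current process has to `path`.
 * A missing file answers 0, the same as a file with no access. */
int file_access_mask(const char *path)
{
    int mask = 0;

    if (access(path, R_OK) == 0) mask |= CAN_READ;
    if (access(path, W_OK) == 0) mask |= CAN_WRITE;
    if (access(path, X_OK) == 0) mask |= CAN_EXECUTE;
    return mask;
}
```

Neither function needs to build a result array, which is the point of keeping the common existence check separate from the full stat() answer.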