The efficiency of FileList>>listForPattern: is something which should
deserve a bit more care. It tries to sort the files, for example by name, but wants to display directory first ^ [ :x :y | |xIsDir| ((xIsDir := x isDirectory) = y isDirectory) ifTrue: [ x basename <= y basename ] ifFalse: [ "directories always precede files" xIsDir ]] Alas, this isDirectory test cost you an arm: FileReference>>isDirectory ^ filesystem isDirectory: path FileSystem>>isDirectory: aResolvable "Resolve the argument, and answer true if the result refers to a directory, false if it refers to a file or doesn't exist." ^ store isDirectory: (self resolve: aResolvable) FileSystemStore>>isDirectory: aPath aPath isRoot ifTrue: [ ^ true ]. self nodeAt: aPath ifPresent: [ :entry | ^ self basicIsDirectory: entry ] ifAbsent: [ ^ false ]. DiskStore>>nodeAt: aPath ifPresent: presentBlock ifAbsent: absentBlock | name| aPath isRoot ifTrue: [ ^ presentBlock value: self rootNode ]. "| encodedPath encodedBasename entry | encodedPath := Primitives encode: (self stringFromPath: aPath parent). encodedBasename := Primitives encode: aPath basename. entry := Primitives lookupDirectory: encodedPath filename: encodedBasename. ^ entry == #badDirectoryPath ifTrue: absentBlock ifFalse: [ entry at: 1 put: aPath basename. presentBlock value: entry ]." name := aPath basename. self directoryAt: aPath parent ifAbsent: absentBlock nodesDo: [ :entry | (self filename: (entry at: 1) matches: name) ifTrue: [ ^ presentBlock value: entry ] ]. ^ absentBlock value Arghh, it scans the whole parent directory again! If sort is O(n log n), then we transform it into O(2 n^2 log n). Try to browse your package-cache if you're not a chicken. Seriously, it's unusable... Nicolas |
This is one of those things that I find shocking. Squeak has been around for 15+ years, and still has no efficient way to get files matching a wildcard pattern (I sure couldn't find it), FFI does not (easily) support callbacks, etc.
________________________________________ From: [hidden email] [[hidden email]] on behalf of Nicolas Cellier [[hidden email]] Sent: Wednesday, May 09, 2012 6:18 PM To: Pharo Development Subject: [Pharo-project] FileList efficiency could hardly be worse The efficiency of FileList>>listForPattern: is something which should deserve a bit more care. It tries to sort the files, for example by name, but wants to display directory first ^ [ :x :y | |xIsDir| ((xIsDir := x isDirectory) = y isDirectory) ifTrue: [ x basename <= y basename ] ifFalse: [ "directories always precede files" xIsDir ]] Alas, this isDirectory test cost you an arm: FileReference>>isDirectory ^ filesystem isDirectory: path FileSystem>>isDirectory: aResolvable "Resolve the argument, and answer true if the result refers to a directory, false if it refers to a file or doesn't exist." ^ store isDirectory: (self resolve: aResolvable) FileSystemStore>>isDirectory: aPath aPath isRoot ifTrue: [ ^ true ]. self nodeAt: aPath ifPresent: [ :entry | ^ self basicIsDirectory: entry ] ifAbsent: [ ^ false ]. DiskStore>>nodeAt: aPath ifPresent: presentBlock ifAbsent: absentBlock | name| aPath isRoot ifTrue: [ ^ presentBlock value: self rootNode ]. "| encodedPath encodedBasename entry | encodedPath := Primitives encode: (self stringFromPath: aPath parent). encodedBasename := Primitives encode: aPath basename. entry := Primitives lookupDirectory: encodedPath filename: encodedBasename. ^ entry == #badDirectoryPath ifTrue: absentBlock ifFalse: [ entry at: 1 put: aPath basename. presentBlock value: entry ]." name := aPath basename. self directoryAt: aPath parent ifAbsent: absentBlock nodesDo: [ :entry | (self filename: (entry at: 1) matches: name) ifTrue: [ ^ presentBlock value: entry ] ]. ^ absentBlock value Arghh, it scans the whole parent directory again! If sort is O(n log n), then we transform it into O(2 n^2 log n). Try to browse your package-cache if you're not a chicken. Seriously, it's unusable... Nicolas |
In reply to this post by Nicolas Cellier
:)
indeed. Imagine my 10 GB package cache. Stef On May 10, 2012, at 12:18 AM, Nicolas Cellier wrote: > The efficiency of FileList>>listForPattern: is something which should > deserve a bit more care. > > It tries to sort the files, for example by name, but wants to display > directory first > ^ [ :x :y | |xIsDir| > ((xIsDir := x isDirectory) = y isDirectory) > ifTrue: [ x basename <= y basename ] > ifFalse: [ > "directories always precede files" > xIsDir ]] > > Alas, this isDirectory test cost you an arm: > > FileReference>>isDirectory > ^ filesystem isDirectory: path > > FileSystem>>isDirectory: aResolvable > "Resolve the argument, and answer true if the result refers > to a directory, false if it refers to a file or doesn't exist." > > ^ store isDirectory: (self resolve: aResolvable) > > FileSystemStore>>isDirectory: aPath > aPath isRoot ifTrue: [ ^ true ]. > self > nodeAt: aPath > ifPresent: [ :entry | ^ self basicIsDirectory: entry ] > ifAbsent: [ ^ false ]. > > DiskStore>>nodeAt: aPath ifPresent: presentBlock ifAbsent: absentBlock > | name| > aPath isRoot ifTrue: [ ^ presentBlock value: self rootNode ]. > "| encodedPath encodedBasename entry | > encodedPath := Primitives encode: (self stringFromPath: aPath parent). > encodedBasename := Primitives encode: aPath basename. > entry := Primitives lookupDirectory: encodedPath filename: encodedBasename. > ^ entry == #badDirectoryPath > ifTrue: absentBlock > ifFalse: [ > entry at: 1 put: aPath basename. > presentBlock value: entry ]." > name := aPath basename. > self > directoryAt: aPath parent > ifAbsent: absentBlock > nodesDo: > [ :entry | > (self filename: (entry at: 1) matches: name) > ifTrue: [ ^ presentBlock value: entry ] ]. > ^ absentBlock value > > Arghh, it scans the whole parent directory again! > If sort is O(n log n), then we transform it into O(2 n^2 log n). > > Try to browse your package-cache if you're not a chicken. > Seriously, it's unusable... > > Nicolas > |
Thanks Nicolas. I would appreciate any effort in this directory since I also have a 10gb package-cache and yes, I know it is slow.
If you can provive a slice :) On Thu, May 10, 2012 at 9:44 AM, Stéphane Ducasse <[hidden email]> wrote: :) -- Mariano http://marianopeck.wordpress.com |
:/ sadly this is not new...
see http://code.google.com/p/pharo/issues/detail?id=5316 and http://code.google.com/p/cog/issues/detail?id=73 and I mentioned it in an earlier thread. if you want you can write a fix for this by using the other file primitives (I don't remember the exact name right now) where you access the entries by name instead of by index, which of course avoids looping internally again. I wrote a fix for this under OSX where they weren't implemented... so basically it should work right ahead! btw this implementation is super fun!! Filelist loops => n DiskStore loops => n*(n/2) "in average you loop until you find the proper entry) Primitive loops => n*(n/2)*n "again since you access the stuff by index" best cami On 2012-05-10, at 09:56, Mariano Martinez Peck wrote: > Thanks Nicolas. I would appreciate any effort in this directory since I also have a 10gb package-cache and yes, I know it is slow. > If you can provive a slice :) > > On Thu, May 10, 2012 at 9:44 AM, Stéphane Ducasse <[hidden email]> wrote: > :) > > indeed. Imagine my 10 GB package cache. > > Stef > > > On May 10, 2012, at 12:18 AM, Nicolas Cellier wrote: > > > The efficiency of FileList>>listForPattern: is something which should > > deserve a bit more care. > > > > It tries to sort the files, for example by name, but wants to display > > directory first > > ^ [ :x :y | |xIsDir| > > ((xIsDir := x isDirectory) = y isDirectory) > > ifTrue: [ x basename <= y basename ] > > ifFalse: [ > > "directories always precede files" > > xIsDir ]] > > > > Alas, this isDirectory test cost you an arm: > > > > FileReference>>isDirectory > > ^ filesystem isDirectory: path > > > > FileSystem>>isDirectory: aResolvable > > "Resolve the argument, and answer true if the result refers > > to a directory, false if it refers to a file or doesn't exist." > > > > ^ store isDirectory: (self resolve: aResolvable) > > > > FileSystemStore>>isDirectory: aPath > > aPath isRoot ifTrue: [ ^ true ]. > > self > > nodeAt: aPath > > ifPresent: [ :entry | ^ self basicIsDirectory: entry ] > > ifAbsent: [ ^ false ]. > > > > DiskStore>>nodeAt: aPath ifPresent: presentBlock ifAbsent: absentBlock > > | name| > > aPath isRoot ifTrue: [ ^ presentBlock value: self rootNode ]. > > "| encodedPath encodedBasename entry | > > encodedPath := Primitives encode: (self stringFromPath: aPath parent). > > encodedBasename := Primitives encode: aPath basename. > > entry := Primitives lookupDirectory: encodedPath filename: encodedBasename. > > ^ entry == #badDirectoryPath > > ifTrue: absentBlock > > ifFalse: [ > > entry at: 1 put: aPath basename. > > presentBlock value: entry ]." > > name := aPath basename. > > self > > directoryAt: aPath parent > > ifAbsent: absentBlock > > nodesDo: > > [ :entry | > > (self filename: (entry at: 1) matches: name) > > ifTrue: [ ^ presentBlock value: entry ] ]. > > ^ absentBlock value > > > > Arghh, it scans the whole parent directory again! > > If sort is O(n log n), then we transform it into O(2 n^2 log n). > > > > Try to browse your package-cache if you're not a chicken. > > Seriously, it's unusable... > > > > Nicolas > > > > > > > > -- > Mariano > http://marianopeck.wordpress.com > |
In reply to this post by Schwab,Wilhelm K
Please excuse my clarification but... this is a Pharo issue not a
Squeak issue. Squeak is fine and always has been. Time to display my package-cache directory in Squeak 4.3: < 1 second Time to display my package-cache directory in Pharo 1.4: > 5 minutes and still waiting before I give up and kill -9.. On Wed, May 9, 2012 at 5:33 PM, Schwab,Wilhelm K <[hidden email]> wrote: > This is one of those things that I find shocking. Squeak has been around for 15+ years, and still has no efficient way to get files matching a wildcard pattern (I sure couldn't find it), FFI does not (easily) support callbacks, etc. > > > > > ________________________________________ > From: [hidden email] [[hidden email]] on behalf of Nicolas Cellier [[hidden email]] > Sent: Wednesday, May 09, 2012 6:18 PM > To: Pharo Development > Subject: [Pharo-project] FileList efficiency could hardly be worse > > The efficiency of FileList>>listForPattern: is something which should > deserve a bit more care. > > It tries to sort the files, for example by name, but wants to display > directory first > ^ [ :x :y | |xIsDir| > ((xIsDir := x isDirectory) = y isDirectory) > ifTrue: [ x basename <= y basename ] > ifFalse: [ > "directories always precede files" > xIsDir ]] > > Alas, this isDirectory test cost you an arm: > > FileReference>>isDirectory > ^ filesystem isDirectory: path > > FileSystem>>isDirectory: aResolvable > "Resolve the argument, and answer true if the result refers > to a directory, false if it refers to a file or doesn't exist." > > ^ store isDirectory: (self resolve: aResolvable) > > FileSystemStore>>isDirectory: aPath > aPath isRoot ifTrue: [ ^ true ]. > self > nodeAt: aPath > ifPresent: [ :entry | ^ self basicIsDirectory: entry ] > ifAbsent: [ ^ false ]. > > DiskStore>>nodeAt: aPath ifPresent: presentBlock ifAbsent: absentBlock > | name| > aPath isRoot ifTrue: [ ^ presentBlock value: self rootNode ]. > "| encodedPath encodedBasename entry | > encodedPath := Primitives encode: (self stringFromPath: aPath parent). > encodedBasename := Primitives encode: aPath basename. > entry := Primitives lookupDirectory: encodedPath filename: encodedBasename. > ^ entry == #badDirectoryPath > ifTrue: absentBlock > ifFalse: [ > entry at: 1 put: aPath basename. > presentBlock value: entry ]." > name := aPath basename. > self > directoryAt: aPath parent > ifAbsent: absentBlock > nodesDo: > [ :entry | > (self filename: (entry at: 1) matches: name) > ifTrue: [ ^ presentBlock value: entry ] ]. > ^ absentBlock value > > Arghh, it scans the whole parent directory again! > If sort is O(n log n), then we transform it into O(2 n^2 log n). > > Try to browse your package-cache if you're not a chicken. > Seriously, it's unusable... > > Nicolas > > |
I just checked:
- Using the old FileDirectory has the advantage that we keep the low level entries directory => O(n^n) - FIleSystem doesn't do that ;) => O(n^3) we should really switch to the other, direct access primitive! On 2012-05-11, at 03:20, Chris Muller wrote: > Please excuse my clarification but... this is a Pharo issue not a > Squeak issue. Squeak is fine and always has been. > > Time to display my package-cache directory in Squeak 4.3: < 1 second > > Time to display my package-cache directory in Pharo 1.4: > 5 minutes > and still waiting before I give up and kill -9.. > > > > On Wed, May 9, 2012 at 5:33 PM, Schwab,Wilhelm K <[hidden email]> wrote: >> This is one of those things that I find shocking. Squeak has been around for 15+ years, and still has no efficient way to get files matching a wildcard pattern (I sure couldn't find it), FFI does not (easily) support callbacks, etc. >> >> >> >> >> ________________________________________ >> From: [hidden email] [[hidden email]] on behalf of Nicolas Cellier [[hidden email]] >> Sent: Wednesday, May 09, 2012 6:18 PM >> To: Pharo Development >> Subject: [Pharo-project] FileList efficiency could hardly be worse >> >> The efficiency of FileList>>listForPattern: is something which should >> deserve a bit more care. >> >> It tries to sort the files, for example by name, but wants to display >> directory first >> ^ [ :x :y | |xIsDir| >> ((xIsDir := x isDirectory) = y isDirectory) >> ifTrue: [ x basename <= y basename ] >> ifFalse: [ >> "directories always precede files" >> xIsDir ]] >> >> Alas, this isDirectory test cost you an arm: >> >> FileReference>>isDirectory >> ^ filesystem isDirectory: path >> >> FileSystem>>isDirectory: aResolvable >> "Resolve the argument, and answer true if the result refers >> to a directory, false if it refers to a file or doesn't exist." >> >> ^ store isDirectory: (self resolve: aResolvable) >> >> FileSystemStore>>isDirectory: aPath >> aPath isRoot ifTrue: [ ^ true ]. >> self >> nodeAt: aPath >> ifPresent: [ :entry | ^ self basicIsDirectory: entry ] >> ifAbsent: [ ^ false ]. >> >> DiskStore>>nodeAt: aPath ifPresent: presentBlock ifAbsent: absentBlock >> | name| >> aPath isRoot ifTrue: [ ^ presentBlock value: self rootNode ]. >> "| encodedPath encodedBasename entry | >> encodedPath := Primitives encode: (self stringFromPath: aPath parent). >> encodedBasename := Primitives encode: aPath basename. >> entry := Primitives lookupDirectory: encodedPath filename: encodedBasename. >> ^ entry == #badDirectoryPath >> ifTrue: absentBlock >> ifFalse: [ >> entry at: 1 put: aPath basename. >> presentBlock value: entry ]." >> name := aPath basename. >> self >> directoryAt: aPath parent >> ifAbsent: absentBlock >> nodesDo: >> [ :entry | >> (self filename: (entry at: 1) matches: name) >> ifTrue: [ ^ presentBlock value: entry ] ]. >> ^ absentBlock value >> >> Arghh, it scans the whole parent directory again! >> If sort is O(n log n), then we transform it into O(2 n^2 log n). >> >> Try to browse your package-cache if you're not a chicken. >> Seriously, it's unusable... >> >> Nicolas >> >> > |
In reply to this post by Chris Muller-3
thanks for the report.
we should investigate that. Stef On May 11, 2012, at 3:20 AM, Chris Muller wrote: > Please excuse my clarification but... this is a Pharo issue not a > Squeak issue. Squeak is fine and always has been. > > Time to display my package-cache directory in Squeak 4.3: < 1 second > > Time to display my package-cache directory in Pharo 1.4: > 5 minutes > and still waiting before I give up and kill -9.. > > > > On Wed, May 9, 2012 at 5:33 PM, Schwab,Wilhelm K <[hidden email]> wrote: >> This is one of those things that I find shocking. Squeak has been around for 15+ years, and still has no efficient way to get files matching a wildcard pattern (I sure couldn't find it), FFI does not (easily) support callbacks, etc. >> >> >> >> >> ________________________________________ >> From: [hidden email] [[hidden email]] on behalf of Nicolas Cellier [[hidden email]] >> Sent: Wednesday, May 09, 2012 6:18 PM >> To: Pharo Development >> Subject: [Pharo-project] FileList efficiency could hardly be worse >> >> The efficiency of FileList>>listForPattern: is something which should >> deserve a bit more care. >> >> It tries to sort the files, for example by name, but wants to display >> directory first >> ^ [ :x :y | |xIsDir| >> ((xIsDir := x isDirectory) = y isDirectory) >> ifTrue: [ x basename <= y basename ] >> ifFalse: [ >> "directories always precede files" >> xIsDir ]] >> >> Alas, this isDirectory test cost you an arm: >> >> FileReference>>isDirectory >> ^ filesystem isDirectory: path >> >> FileSystem>>isDirectory: aResolvable >> "Resolve the argument, and answer true if the result refers >> to a directory, false if it refers to a file or doesn't exist." >> >> ^ store isDirectory: (self resolve: aResolvable) >> >> FileSystemStore>>isDirectory: aPath >> aPath isRoot ifTrue: [ ^ true ]. >> self >> nodeAt: aPath >> ifPresent: [ :entry | ^ self basicIsDirectory: entry ] >> ifAbsent: [ ^ false ]. >> >> DiskStore>>nodeAt: aPath ifPresent: presentBlock ifAbsent: absentBlock >> | name| >> aPath isRoot ifTrue: [ ^ presentBlock value: self rootNode ]. >> "| encodedPath encodedBasename entry | >> encodedPath := Primitives encode: (self stringFromPath: aPath parent). >> encodedBasename := Primitives encode: aPath basename. >> entry := Primitives lookupDirectory: encodedPath filename: encodedBasename. >> ^ entry == #badDirectoryPath >> ifTrue: absentBlock >> ifFalse: [ >> entry at: 1 put: aPath basename. >> presentBlock value: entry ]." >> name := aPath basename. >> self >> directoryAt: aPath parent >> ifAbsent: absentBlock >> nodesDo: >> [ :entry | >> (self filename: (entry at: 1) matches: name) >> ifTrue: [ ^ presentBlock value: entry ] ]. >> ^ absentBlock value >> >> Arghh, it scans the whole parent directory again! >> If sort is O(n log n), then we transform it into O(2 n^2 log n). >> >> Try to browse your package-cache if you're not a chicken. >> Seriously, it's unusable... >> >> Nicolas >> >> > |
In reply to this post by Camillo Bruni-3
2012/5/11 Camillo Bruni <[hidden email]>:
> I just checked: > - Using the old FileDirectory has the advantage that we keep the low level entries directory => O(n^n) > - FIleSystem doesn't do that ;) => O(n^3) > > we should really switch to the other, direct access primitive! > The code for switching to lookupDirectory:filename: is there but commented. The primitive works OK on my mac with Eliot's VM... But the commented code only handle the case of bad directory... If directory exists, but does not contain the named file, then it answers nil... Find updated code at: http://code.google.com/p/pharo/issues/detail?id=5880 Nicolas > On 2012-05-11, at 03:20, Chris Muller wrote: >> Please excuse my clarification but... this is a Pharo issue not a >> Squeak issue. Squeak is fine and always has been. >> >> Time to display my package-cache directory in Squeak 4.3: < 1 second >> >> Time to display my package-cache directory in Pharo 1.4: > 5 minutes >> and still waiting before I give up and kill -9.. >> >> >> >> On Wed, May 9, 2012 at 5:33 PM, Schwab,Wilhelm K <[hidden email]> wrote: >>> This is one of those things that I find shocking. Squeak has been around for 15+ years, and still has no efficient way to get files matching a wildcard pattern (I sure couldn't find it), FFI does not (easily) support callbacks, etc. >>> >>> >>> >>> >>> ________________________________________ >>> From: [hidden email] [[hidden email]] on behalf of Nicolas Cellier [[hidden email]] >>> Sent: Wednesday, May 09, 2012 6:18 PM >>> To: Pharo Development >>> Subject: [Pharo-project] FileList efficiency could hardly be worse >>> >>> The efficiency of FileList>>listForPattern: is something which should >>> deserve a bit more care. >>> >>> It tries to sort the files, for example by name, but wants to display >>> directory first >>> ^ [ :x :y | |xIsDir| >>> ((xIsDir := x isDirectory) = y isDirectory) >>> ifTrue: [ x basename <= y basename ] >>> ifFalse: [ >>> "directories always precede files" >>> xIsDir ]] >>> >>> Alas, this isDirectory test cost you an arm: >>> >>> FileReference>>isDirectory >>> ^ filesystem isDirectory: path >>> >>> FileSystem>>isDirectory: aResolvable >>> "Resolve the argument, and answer true if the result refers >>> to a directory, false if it refers to a file or doesn't exist." >>> >>> ^ store isDirectory: (self resolve: aResolvable) >>> >>> FileSystemStore>>isDirectory: aPath >>> aPath isRoot ifTrue: [ ^ true ]. >>> self >>> nodeAt: aPath >>> ifPresent: [ :entry | ^ self basicIsDirectory: entry ] >>> ifAbsent: [ ^ false ]. >>> >>> DiskStore>>nodeAt: aPath ifPresent: presentBlock ifAbsent: absentBlock >>> | name| >>> aPath isRoot ifTrue: [ ^ presentBlock value: self rootNode ]. >>> "| encodedPath encodedBasename entry | >>> encodedPath := Primitives encode: (self stringFromPath: aPath parent). >>> encodedBasename := Primitives encode: aPath basename. >>> entry := Primitives lookupDirectory: encodedPath filename: encodedBasename. >>> ^ entry == #badDirectoryPath >>> ifTrue: absentBlock >>> ifFalse: [ >>> entry at: 1 put: aPath basename. >>> presentBlock value: entry ]." >>> name := aPath basename. >>> self >>> directoryAt: aPath parent >>> ifAbsent: absentBlock >>> nodesDo: >>> [ :entry | >>> (self filename: (entry at: 1) matches: name) >>> ifTrue: [ ^ presentBlock value: entry ] ]. >>> ^ absentBlock value >>> >>> Arghh, it scans the whole parent directory again! >>> If sort is O(n log n), then we transform it into O(2 n^2 log n). >>> >>> Try to browse your package-cache if you're not a chicken. >>> Seriously, it's unusable... >>> >>> Nicolas >>> >>> >> > > |
Free forum by Nabble | Edit this page |