I have been doing some file-intensive work and found my program to be VERY slow (see at the end).
Just to be sure, I ran the same thing in Java and found it was much faster.

So I did a small test:
---
[10 timesRepeat: [i := 0.
'/home/anquetil/Documents/RMod/Tools/workspace/Blocks/jfreechart-0_9_0.mse' asFileReference readStream contents do: [ :c | i:= i+1].
] ] timeToRunWithoutGC.
---

result = 12.932 sec

similar thing (as far as I can tell) 10 times in java: 1.482 sec.
---
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

public class PharoJava {
    public static void main(String[] args) {
        int length = 0;
        try {
            String filename = "/home/anquetil/Documents/RMod/Tools/workspace/Blocks/jfreechart-0_9_0.mse";
            // Read the whole file and decode it as UTF-8.
            String content = new String(Files.readAllBytes(Paths.get(filename)), "UTF8");
            // Touch every character once.
            for (int i = 0; i < content.length(); i++) {
                content.charAt(i);
                length = length + 1;
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
        System.out.println(length);
    }
}
---

Because my program is MUCH slower (see at the end) in Smalltalk than in Java, I did another experiment:

---
[1 to: 10 do: [:i| 1 to: 100000000 do: [:j | String new] ] ] timeToRunWithoutGC.
---

result = 33.063 sec

and in java: 4.382 sec.
---[10 runs of]
public static void main(String[] args) {
    for (int i = 0; i < 100000000; i++) {
        new String();
    }
}
---

Concretely, my need was:
Take 2600 methods in a Moose model, take their source code (therefore reading files), and for methods longer than 100 lines (there are 29 of them), go through their code to find the blocks (matching {}).
In Smalltalk it ran for more than 12 hours and had processed only 5 of the 29 long methods.
I reimplemented it in Java (basically, just changing from Pharo to Java syntax) and it took 1 minute to compute everything ...

:-(

On the good side, it was much easier to program it in Smalltalk (about half a day to think about the algorithm, experiment, implement, test) than in Java (another 1/2 day, just to recode the algorithm that already worked).

nicolas
Some speed difference is to be expected, but the numbers at the end are too extreme.
You are mixing micro and macro benchmarks. The micro benchmarks indicate a 10x slowdown, the macro benchmark 700x ! Maybe your algorithm is not optimally implemented in Pharo ?
On 16/03/15 10:14, Sven Van Caekenberghe wrote:
> Some speed difference is to be expected, but the numbers at the end are too extreme.
>
> The micro benchmarks indicate a 10x slowdown, the macro benchmark 700x !

The algorithm is not the same. Macro is 4000 times slower (29/5*700). Some of the navigations/iterators in Moose models do too much work and are not just iterating over a simple array.

Stephan
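The extrapolation behind the "4000 times" figure can be written out as a short calculation. This is only a sketch: the class and method names are made up for illustration, the inputs are the rough numbers quoted in this thread (about 700x observed, 5 of the 29 long methods processed), and it assumes every long method costs about the same.

```java
public class SlowdownEstimate {
    // Scale an observed slowdown factor up to the full workload, assuming
    // (hypothetically) that each item takes roughly the same time.
    public static double extrapolate(double observedFactor, int totalItems, int itemsDone) {
        return observedFactor * totalItems / itemsDone;
    }

    public static void main(String[] args) {
        // 700x observed after only 5 of the 29 long methods were processed.
        System.out.println(extrapolate(700.0, 29, 5)); // prints 4060.0
    }
}
```

The result, 4060, is the value that gets rounded to 4000 above.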
In reply to this post by Nicolas Anquetil
To compare more accurately try something equivalent to

| a n |
a := '/home/anquetil/Documents/RMod/Tools/workspace/Blocks/jfreechart-0_9_0.mse' asFileReference readStream contents.
n := 0.
1 to: a size do: [ :i | | c | c := a at: i. n := n + 1 ].

Eliot (phone)

On Mar 16, 2015, at 1:49 AM, Nicolas Anquetil <[hidden email]> wrote:

> '/home/anquetil/Documents/RMod/Tools/workspace/Blocks/jfreechart-0_9_0.mse' asFileReference readStream contents do: [ :c | i:= i+1].
In reply to this post by Nicolas Anquetil
Eliot, Sven, Stephan,

thank you for your answers.

As you noticed, I am not an expert in profiling :-)

It now seems I might have goofed up, and the time taken by Pharo in my initial program (compared to Java) is due to some extra compilation I was doing.

So the "macro benchmark" might be wrong.

The "micro benchmark" still holds, though.
I tested the code proposed by Eliot and the result is ....

---
[1 to: 10 do: [:j || a length |
length:=0.
a := '/home/anquetil/Documents/RMod/Tools/workspace/Blocks/jfreechart-0_9_0.mse' asFileReference readStream contents.
1 to: a size do: [ :i| | c | c:= a at: i. length:= length+1]]] timeToRunWithoutGC
---

12.723 sec.

[reminder] For java it is: 1.482 sec.

so it is still a factor of 8 or 9
it seems a lot for such a simple thing, no?
(or maybe not, I don't know)

nicolas
Can you post/share your file (jfreechart-0_9_0.mse) somewhere so we can run the same test ?
Also, in your Java code I do not see a loop doing the benchmark 10 times ...
Yeah, put the file on a dropbox somewhere and share the link.
I'd like to see why this is "slow". I am reading tons of data from a MongoDb and it is superfast.

Phil
the file is 10M.

it seems to me the content does not change anything since we are just reading it character by character without doing anything else.

anyway, you can find it at:
https://dl.dropboxusercontent.com/u/12861461/jfreechart-0_9_0.mse

nicolas
In reply to this post by Sven Van Caekenberghe-2
> On 17 Mar 2015, at 10:24, Sven Van Caekenberghe <[hidden email]> wrote:
>
> Also, in your Java code I do not see a loop doing the benchmark 10 times ...

Just to make sure: where is your 10x loop in Java ?

> similar thing (as far as I can tell) 10 times in java: 1.482 sec.
in the bash:
---
$ for i in 1 2 3 4 5 6 7 8 9 10; do time java PharoJava; done
---

then I did the sum manually :-)

nicolas
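Timing ten separate `java` launches from the shell includes JVM startup in every run, whereas the Pharo figure is measured inside an already running image. A minimal sketch of repeating the measurement inside one JVM follows. The class name `InProcessBench`, the method names and the command-line argument are assumptions made for illustration, not code from this thread; the loop body is the same as in the posted Java program.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class InProcessBench {
    // Same work as the original loop: touch every character and count it.
    public static int countChars(String content) {
        int length = 0;
        for (int i = 0; i < content.length(); i++) {
            content.charAt(i);
            length = length + 1;
        }
        return length;
    }

    // Read, decode and scan the file `runs` times; returns elapsed milliseconds.
    public static long timeRuns(String filename, int runs) throws IOException {
        long start = System.nanoTime();
        for (int r = 0; r < runs; r++) {
            String content = new String(
                    Files.readAllBytes(Paths.get(filename)), StandardCharsets.UTF_8);
            countChars(content);
        }
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws IOException {
        // The file path is passed as the first command-line argument.
        System.out.println(timeRuns(args[0], 10) + " ms for 10 runs");
    }
}
```

Run as `java InProcessBench /path/to/jfreechart-0_9_0.mse`. Later iterations may also benefit from JIT warm-up, so the total is not directly comparable to the sum of ten cold starts.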
In reply to this post by Sven Van Caekenberghe-2
I tried it myself, java seems to be 7 times faster on a 35 MB jfreechart.mse file I found on github. Moose 5.1 managed about 30 MB/s.

UTF8 is rather suboptimal for source code. Nearly all of it is ASCII, which can be processed a machine word at a time instead of a byte at a time. There were earlier discussions about that:
http://forum.world.st/Fastest-utf-8-encoder-contest-td4634566.html

Stephan
Thanks for the pointer to the file (finally !).

Using this file:
https://raw.githubusercontent.com/mircealungu/experiments-polymorphism/master/fileouts/jfreechart.mse
which is indeed 35 MB, we can do better.

Since

(FileLocator desktop / 'jfreechart.mse') binaryReadStreamDo: [ :in |
  in contents allSatisfy: [ :each | each < 127 ] ].

is true, we can skip decoding.

For me, it is pretty fast now

[
  | count |
  count := 0.
  (FileLocator desktop / 'jfreechart.mse') binaryReadStreamDo: [ :in |
    in contents do: [ :each | count := count + 1 ] ].
  count
] timeToRun.

"0:00:00:00.637"

Adding UTF8 decoding (implemented in Pharo) makes it 10x slower

[
  | count |
  count := 0.
  (FileLocator desktop / 'jfreechart.mse') binaryReadStreamDo: [ :in |
    in contents utf8Decoded do: [ :each | count := count + 1 ] ].
  count
] timeToRun.

"0:00:00:07.45"

HTH,

Sven
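The same idea (check once that all bytes are 7-bit, then skip UTF-8 decoding) can be sketched on the Java side for comparison. The class `AsciiFastPath` and its methods are hypothetical names introduced here. Whether the one-byte-per-char path is actually faster on a given JVM is an assumption that would have to be measured.

```java
import java.nio.charset.StandardCharsets;

public class AsciiFastPath {
    // True when every byte is 7-bit. Java bytes are signed, so values
    // of 128 and above show up as negative numbers.
    public static boolean isAscii(byte[] data) {
        for (byte b : data) {
            if (b < 0) {
                return false;
            }
        }
        return true;
    }

    // Use a one-byte-per-char charset for pure ASCII input, UTF-8 otherwise.
    public static String decode(byte[] data) {
        return isAscii(data)
                ? new String(data, StandardCharsets.ISO_8859_1)
                : new String(data, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] ascii = "(Moose.Method)".getBytes(StandardCharsets.UTF_8);
        byte[] accented = "caf\u00e9".getBytes(StandardCharsets.UTF_8);
        System.out.println(isAscii(ascii));            // true
        System.out.println(isAscii(accented));         // false
        System.out.println(decode(accented).length()); // 4 chars from 5 bytes
    }
}
```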
In reply to this post by Nicolas Anquetil
Ok, I've tried this out.
First version
----------------

[ |length a|
length := 0.
1 to: 10 do: [ :index |
    ('Loop {1}' format: { index }) logCr.
    a := (FileLocator imageDirectory / 'javacomp' / 'jfreechart-0_9_0.mse') readStream contents.
    (ReadStream on: a) do: [ :c |
        length := length + 1.
    ].
    length asString logCr.
]] timeToRun

0:00:00:22.33

Takes a lot of time.

Second version (streaming, less memory intensive)
---------------------------------------------------------------------

[ |length c|
length := 0.
1 to: 10 do: [ :index |
    ('Loop {1}' format: { index }) logCr.
    (FileLocator imageDirectory / 'javacomp' / 'jfreechart-0_9_0.mse') readStreamDo: [ :s |
        [ s atEnd ] whileFalse: [
            c := s next.
            length := length + 1.
        ]
    ].
    length asString logCr.
]] timeToRun

0:00:00:03.683

Already better.

But profiling version 1 showed the issue. We are dealing with a multibyte stream there.

So, switching to a StandardFileStream gives

Version 3
-------------

[ |length a|
length := 0.
1 to: 10 do: [ :index |
    ('Loop {1}' format: { index }) logCr.
    a := (StandardFileStream fileNamed: (FileLocator imageDirectory / 'javacomp' / 'jfreechart-0_9_0.mse') pathString) readStream contents.
    a do: [ :c |
        length := length + 1.
    ].
    length asString logCr.
]] timeToRun

0:00:00:03.18

I see that Java does new String(Files.readAllBytes(Paths.get(filename)), "UTF8").

readAllBytes seems suspect to me, even with UTF8. Looks like a standard file stream with no conversion.

Pharo isn't so slow after all.

HTH
Phil
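On the Java side, `Files.readAllBytes` itself returns raw bytes, but the `new String(bytes, "UTF8")` constructor wrapped around it in the posted benchmark does decode them as UTF-8. A small sketch that shows the difference between byte count and decoded character count; the class name `DecodeCheck` and the sample bytes are made up for illustration.

```java
import java.nio.charset.StandardCharsets;

public class DecodeCheck {
    // Number of chars after decoding the bytes as UTF-8,
    // which is what the String constructor in the benchmark does.
    public static int decodedLength(byte[] raw) {
        return new String(raw, StandardCharsets.UTF_8).length();
    }

    public static void main(String[] args) {
        // 0xC3 0xA9 is the two-byte UTF-8 encoding of U+00E9 (e with acute accent).
        byte[] raw = { (byte) 'a', (byte) 0xC3, (byte) 0xA9 };
        System.out.println(raw.length);         // 3 bytes as stored
        System.out.println(decodedLength(raw)); // 2 chars once decoded
    }
}
```

So the Java timing includes a decoding step; how much that step costs compared with Pharo's decoder is not something this sketch measures.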
In reply to this post by Sven Van Caekenberghe-2
Ah, you beat me :-)
Still, your implementation isn't loading the whole contents as the Java version does.

The key issue is the conversion indeed.

Phil
> On 17 Mar 2015, at 16:29, [hidden email] wrote:
>
> Still, your implementation isn't loading the whole contents as the Java version does.

Well, your third piece of code, which uses StandardFileStream, does not do any UTF-8 decoding either. It uses FileStream>>#contents, like my binary stream (ByteArray and ByteString are interchangeable at this level).

Anyway, our conclusion is the same.
In reply to this post by Nicolas Anquetil
Hi Nicolas,
On Tue, Mar 17, 2015 at 2:19 AM, Nicolas Anquetil <[hidden email]> wrote:

> it seems a lot for such a simple thing, no?

Indeed it is. But remember that Java VMs are typically extremely efficient, employing aggressive optimization. What Java VM are you using?

But do not despair! We are working on it. Let me boil down your benchmark and measure it in Cog (the current standard VM) and Spur (the in-development improvement that should become the standard Pharo VM by Pharo 5). Clément and I are also working on Sista, which is an adaptive optimizer for Cog/Spur that should further increase performance.

[| a n |
 a := String new: 10 * 1024 * 1024.
 n := 0.
 1 to: a size do: [:i| | c | c := a at: i. n := n + 1]] bench

Cog: '1.99 per second. 502 milliseconds per run.'
Spur: '12.2 per second. 82.1 milliseconds per run.'

[| a n |
 a := String new: 10 * 1024 * 1024.
 n := 0.
 1 to: a size do: [:i| n := n + 1]] bench

Cog: '2.93 per second. 341 milliseconds per run.'
Spur: '23.8 per second. 42.1 milliseconds per run.'

Now the Java VM's optimizer is probably able to eliminate the at:, so we would expect it to run "| c | c := a at: i. n := n + 1" at the same speed as it runs "n := n + 1". We expect Sista will be able to perform the same optimization.

P.S. Of course, a programmer is able to realise that the above computation is equivalent to

a := String new: 10 * 1024 * 1024.
a size + 1 * (a size / 2)

;-)

best,
Eliot
On 17/03/15 17:13, Eliot Miranda wrote:
> Hi Nicolas,
> Indeed it is. But remember that Java VMs are typically extremely
> efficient, employing aggressive optimization. What Java VM are you using?

I was using OpenJDK 7 on the latest Ubuntu.

> But do not despair! We are working on it. Let me boil down your
> benchmark and measure it in Cog (the current standard VM) and Spur (the
> in-development improvement that should become the standard Pharo VM by
> Pharo 5). Clément and I are also working on Sista, which is an adaptive
> optimizer for Cog/Spur that should further increase performance.

I tried the code with the latest pharo-spur image and vm:
from 17 seconds down to 10.5

Stephan
On 17/03/2015 17:59, Stephan Eggermont wrote:
> On 17/03/15 17:13, Eliot Miranda wrote:
>> What Java VM are you using?
>
> I was using OpenJDK 7 on the latest Ubuntu.

same here: java-7-openjdk-amd64

>> But do not despair! We are working on it.

I am not despairing :-)
I will look into the suggestion from Sven and Phil to use a binary stream.

nicolas
In reply to this post by Stephan Eggermont-3
On 17/03/15 17:59, Stephan Eggermont wrote:
> I tried the code with the latest pharo-spur image and vm:
> from 17 seconds down to 10.5

And I tried it again with cogspurlinux 3268 and the trunk46-spur. That needed switching to MultiByteFileStream readOnlyFileNamed: and ran in 8.8 sec (average of three runs). Interestingly, on Pharo that is significantly slower, about 15 sec. Replacing that by StandardFileStream (and no decoding) reduced it to <120 ms.

Stephan

[1 to: 10 do: [:j | |a length|
length := 0.
a := (MultiByteFileStream readOnlyFileNamed: '/home/stephan/Downloads/jfreechart.mse') readStream contents]] timeToRunWithoutGC
Maybe using FFI to read the file in the correct format would be a nice option to have available.

The code in MultiByteFileStream looks quite convoluted when reading.

There is no reason we should need a 64-bit VM to read a UTF8 file fast. (8 secs really does not qualify)

Opinions?

Phil