Binary file I/O performance problems

Re: Re: Binary file I/O performance problems

Herbert König
Hello David,


YO>   I'm sure that there are other implications, but it sounds like you
YO> do need some primitives to make it efficient.  I would make a
YO> primitive that is the equivalent of read_xyza_ping() and fills a
YO> Squeak object, or, if you are dealing with an array of XYZA_Ping
YO> structures, make an array of homogeneous arrays so that all linenames
YO> are stored in a ByteArray, all pingnums are stored in a WordArray, etc.
YO> In this way, you may still be able to utilize the vector primitives.

This approach seems to have a chance of solving the speed problem.
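
A minimal sketch of the "array of homogeneous arrays" layout, written in Python rather than Squeak. The record format here is an assumption made up for illustration (a one-byte line id, a 32-bit ping number, and three 32-bit floats); it is not the real XYZA_Ping layout, and `read_columns` is a hypothetical helper name.

```python
import struct
from array import array

# Hypothetical fixed-size record, NOT the real XYZA_Ping format:
# little-endian, 1-byte line id, 4-byte unsigned ping number, three 4-byte floats.
RECORD = struct.Struct("<BIfff")


def read_columns(buf):
    """Unpack a buffer of records into one homogeneous array per field."""
    line_ids = bytearray()      # plays the role of a ByteArray
    pingnums = array("L")       # plays the role of a WordArray
    xs = array("f")             # 32-bit floats, like a FloatArray
    ys = array("f")
    zs = array("f")
    for line_id, pingnum, x, y, z in RECORD.iter_unpack(buf):
        line_ids.append(line_id)
        pingnums.append(pingnum)
        xs.append(x)
        ys.append(y)
        zs.append(z)
    return {"line_ids": line_ids, "pingnums": pingnums,
            "xs": xs, "ys": ys, "zs": zs}


# Usage: two records packed back to back, then split into columns.
data = RECORD.pack(7, 100, 1.5, 2.25, -3.0) + RECORD.pack(8, 101, 0.5, 4.0, 8.0)
columns = read_columns(data)
print(list(columns["pingnums"]))
```

Each field ends up in its own contiguous array, which is the property that would let vector operations run over one field at a time.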

In your original post you talked about 10 significant figures, so be
aware that FloatArray only holds 32-bit floats, which carry only about
7 significant decimal digits.
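
The precision loss can be checked outside Squeak. This is a small Python sketch, assuming only that the stored values are IEEE 754 single-precision floats; `to_float32` is a hypothetical helper and the sample value is made up.

```python
import struct


def to_float32(value):
    """Round a 64-bit Python float to the nearest 32-bit float and back."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


original = 1234567.891          # made-up value with 10 significant figures
stored = to_float32(original)   # what a 32-bit float array would keep

# Near 1.2 million the spacing between adjacent 32-bit floats is 0.125,
# so the stored value is 1234567.875 and the last digits are lost.
print(original, stored)
```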

The second caveat is that if many of your floats are in the range of
1e-38 (around the smallest normalized magnitude of a 32-bit float),
FloatArray gets very slow (speed degradation by a factor of 8).  I'm
talking about FloatArray>>* and *= here.
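
Values below roughly 1.18e-38 in magnitude are stored as subnormal (denormal) 32-bit floats, and presumably that is where the slowdown comes from, although that explanation is an assumption and not something measured here. The Python sketch below only classifies values; `is_subnormal32` and `count_subnormals` are hypothetical helper names.

```python
import struct

FLT_MIN = 2.0 ** -126   # smallest positive normalized 32-bit float, about 1.18e-38


def to_float32(value):
    """Round a 64-bit Python float to the nearest 32-bit float and back."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def is_subnormal32(value):
    """True if the value is nonzero but below the normalized range as a 32-bit float."""
    magnitude = abs(to_float32(value))
    return 0.0 < magnitude < FLT_MIN


def count_subnormals(values):
    """Count how many values would be stored as subnormal 32-bit floats."""
    return sum(1 for v in values if is_subnormal32(v))


# Usage: check a made-up sample before committing to 32-bit storage.
sample = [1.0, 1e-37, 1e-39, -3e-40, 0.0]
print(count_subnormals(sample))
```

Counting such values in a representative data file is a cheap way to find out whether this caveat applies at all.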

Sorry if I sound negative; I just think it's bad to ignore problems that
are known in advance.


--
Cheers,

Herbert  

_______________________________________________
Beginners mailing list
[hidden email]
http://lists.squeakfoundation.org/mailman/listinfo/beginners

Re: Re: Binary file I/O performance problems

David Finlayson-4
I have implemented a number of signal processing programs in both C99
and Python (with the psyco JIT). I have an 8-core Mac Pro workstation
that I can use for parallel processing by launching multiple
instances of the code from Make scripts. An interesting thing
happened when I compared the performance of the C code to the Python
code:

The C code became I/O bound at 4 cores, saturating either the disks or
the memory bus (I am not sure exactly where the bottleneck is). The
Python version never became I/O bound, even at 8 cores, but it did
close to within a factor of 10 of the performance of the C code. This
suggested to me that if I had enough processors to saturate the I/O,
there would be no speed advantage to writing the code in C.
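
The argument can be written as a simple saturation model: aggregate throughput grows with the number of cores until it hits a shared I/O ceiling. The sketch below is only an illustration; the rates are invented numbers in arbitrary units, chosen to reproduce the shape of the observation, not measurements.

```python
import math


def throughput(cores, per_core_rate, io_limit):
    """Aggregate rate when `cores` independent workers share one I/O channel."""
    return min(cores * per_core_rate, io_limit)


def cores_to_saturate(per_core_rate, io_limit):
    """Smallest number of cores at which the I/O channel becomes the bottleneck."""
    return math.ceil(io_limit / per_core_rate)


# Invented numbers (arbitrary units), not measurements:
IO_LIMIT = 400      # ceiling imposed by disks or memory bus
C_RATE = 100        # per-core rate of the C version
PY_RATE = 5         # per-core rate of the Python version

for cores in (1, 4, 8):
    print(cores,
          throughput(cores, C_RATE, IO_LIMIT),
          throughput(cores, PY_RATE, IO_LIMIT))

# With these numbers C saturates at 4 cores, while Python would need 80 cores
# to reach the same ceiling; beyond that point both run at the same speed.
print(cores_to_saturate(C_RATE, IO_LIMIT), cores_to_saturate(PY_RATE, IO_LIMIT))
```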

The next generation of workstations we buy will probably have dozens
of cores but hard drives and memory will only be marginally faster (if
history is any indication). So, if I/O is the rate limiting factor,
not CPU speed, why not look for the most productive programming
environment possible? I've always read that Smalltalk is often
considered the most productive programming environment ever invented.
So I wanted to give it a try. But I am discovering (from the point of
view of a scientist programmer like myself) it lacks a lot in
comparison to Matlab or Python (both high-level) and especially C and
C++ (lots and lots of library code).

I am going to have to weigh the pros and cons of whether it makes
sense to push on with this.

David

Re: Re: Binary file I/O performance problems

Yoshiki Ohshima-2
At Sat, 6 Sep 2008 08:29:35 -0700,
David Finlayson wrote:

>
> The next generation of workstations we buy will probably have dozens
> of cores but hard drives and memory will only be marginally faster (if
> history is any indication). So, if I/O is the rate limiting factor,
> not CPU speed, why not look for the most productive programming
> environment possible? I've always read that Smalltalk is often
> considered the most productive programming environment ever invented.
> So I wanted to give it a try. But I am discovering (from the point of
> view of a scientist programmer like myself) it lacks a lot in
> comparison to Matlab or Python (both high-level) and especially C and
> C++ (lots and lots of library code).

  That observation on the sophistication level is quite right.  And
Squeak's moving/compacting GC would give you some more penalty
compared to other implementations when it involves tens of MB to GBs of
data.

> I am going to have to weigh the pros and cons of whether it makes
> sense to push on with this.

  We tend to build things that are just good enough for our own needs,
but listening to other people's needs is always fun (and depressing ^^).

-- Yoshiki