Binary file I/O performance problems

David Finlayson-4

Binary file I/O performance problems

I've been working on my first Smalltalk program, which needs to read
and write large C structs from a binary file. I wrote two classes,
BinaryStreamReader and BinaryStreamWriter, that take a stream and can
read (or write) all of the integer and floating point types I need
(they also handle byte-swapping if necessary). I wrote a test program
that focuses on just reading a small (for us) 123 MB data file on disk.
The program takes about 166 seconds to run compared to 1.2 seconds for
an equivalent C version (about 140x faster than the Squeak version).

As an example of the style of code I've written, here is the method
that reads an unsigned 32-bit integer:

    uint32
        " returns the next unsigned, 32-bit integer from the binary stream "
        " see PositionableStream for original implementation."
        | n a b c d |
        isBigEndian
            ifTrue:
                [ a := stream next.
                b := stream next.
                c := stream next.
                d := stream next ]
            ifFalse:
                [ d := stream next.
                c := stream next.
                b := stream next.
                a := stream next ].
        (((a notNil and: [ b notNil ]) and: [ c notNil ]) and: [ d notNil ])
            ifTrue:
                [ n := a.
                n := (n bitShift: 8) + b.
                n := (n bitShift: 8) + c.
                n := (n bitShift: 8) + d ]
            ifFalse: [ n := nil ].
        ^ n

There are 4 calls to stream next for each integer and, sure enough,
a profile of the code (attached below) shows that most of the time is
being lost in the StandardFileStream basicNext and next methods. There
must be a better way to do this. Scaled up to operational code, I will
need to process about 40 GB of data per day. My C code currently takes
about 16 cpu hours to do this work (including number crunching). In
Squeak, just reading the data would take 3 cpu months!
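
For comparison, here is a minimal C sketch of the same shift-and-add
decoding, written against a 4-byte buffer rather than a stream.
decode_uint32 is a hypothetical name used only for illustration. One
difference worth noting: in C the result always fits in a machine word,
whereas in a 32-bit Squeak image values of 2^30 and above no longer fit
in a SmallInteger, which is presumably why LargePositiveInteger>>+ shows
up in the tally below.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assemble an unsigned 32-bit value from 4 raw bytes.
 * big_endian != 0: bytes[0] is the most significant byte.
 * big_endian == 0: bytes[3] is the most significant byte. */
uint32_t decode_uint32(const unsigned char bytes[4], int big_endian)
{
    assert(bytes != NULL);
    uint32_t n = 0;
    for (int i = 0; i < 4; i++) {
        /* Pick the next most significant byte for the chosen order. */
        unsigned char b = big_endian ? bytes[i] : bytes[3 - i];
        n = (n << 8) | b;   /* same shift-and-add as the Smalltalk method */
    }
    return n;
}
```

Given the bytes 0x12 0x34 0x56 0x78, this yields 0x12345678 when
big_endian is set and 0x78563412 otherwise.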

Hopefully, someone can help me out here. The working code is available
on squeaksource.com if anyone is interested:

http://www.squeaksource.com/@CWlm_vX4hAPUzk5w/7SVjQQhp

Thanks,

David

Below is a message tally of my program:

- 166088 tallies, 166100 msec.

**Tree**
100.0% {166100ms} SEAFileReader>>printAllBlocks
99.9% {165934ms} ProcessedPingBlock>>readFrom:
99.9% {165934ms} XYZAPingData>>readFrom:
99.7% {165602ms} XYZATransducerData>>readFrom:
95.9% {159290ms} XYZAPointData>>readFrom:
46.4% {77070ms} BinaryStreamReader>>double
|41.9% {69596ms} BinaryStreamReader>>uint32
| |28.1% {46674ms} StandardFileStream>>next
| | |14.1% {23420ms} primitives
| | |14.0% {23254ms} StandardFileStream>>basicNext
| |9.8% {16278ms} LargePositiveInteger>>+
| | |6.1% {10132ms} LargePositiveInteger(Integer)>>+
| | | |3.1% {5149ms} primitives
| | | |3.0% {4983ms} SmallInteger(Number)>>negative
| | |3.7% {6146ms} primitives
| |4.1% {6810ms} primitives
|2.5% {4153ms} Float class(Behavior)>>new:
|2.0% {3322ms} primitives
13.9% {23088ms} BinaryStreamReader>>float
|10.4% {17274ms} BinaryStreamReader>>uint32
| |7.0% {11627ms} StandardFileStream>>next
| | |3.5% {5814ms} primitives
| | |3.5% {5814ms} StandardFileStream>>basicNext
| |2.4% {3986ms} LargePositiveInteger>>+
|2.2% {3654ms} Float class>>fromIEEE32Bit:
13.7% {22756ms} BinaryStreamReader>>int32
|7.7% {12790ms} BinaryStreamReader>>uint32
| |6.8% {11295ms} StandardFileStream>>next
| | 3.5% {5814ms} StandardFileStream>>basicNext
| | 3.4% {5647ms} primitives
|5.2% {8637ms} SmallInteger>>>=
| 4.3% {7142ms} SmallInteger(Magnitude)>>>=
| 3.5% {5814ms} SmallInteger>><
| 2.6% {4319ms} SmallInteger(Integer)>><
10.7% {17773ms} BinaryStreamReader>>uint16
|6.9% {11461ms} StandardFileStream>>next
| |3.5% {5814ms} StandardFileStream>>basicNext
| |3.3% {5481ms} primitives
|3.8% {6312ms} primitives
6.8% {11295ms} BinaryStreamReader>>skip:
|5.0% {8305ms} StandardFileStream>>skip:
3.4% {5647ms} BinaryStreamReader>>int8
2.6% {4319ms} BinaryStreamReader>>uint8

**Leaves**
25.4% {42189ms} StandardFileStream>>basicNext
25.2% {41857ms} StandardFileStream>>next
6.0% {9966ms} BinaryStreamReader>>uint32
5.6% {9302ms} SmallInteger(Number)>>negative
4.6% {7641ms} LargePositiveInteger>>+
3.8% {6312ms} LargePositiveInteger(Integer)>>+
3.8% {6312ms} BinaryStreamReader>>uint16
3.4% {5647ms} Float class(Behavior)>>new:
2.0% {3322ms} BinaryStreamReader>>double

**Memory**
old +3,705,004 bytes
young -28,800 bytes
used +3,676,204 bytes
free +362,744 bytes

**GCs**
full 50 totalling 2,524ms (2.0% uptime), avg 50.0ms
incr 19959 totalling 2,794ms (2.0% uptime), avg 0.0ms
tenures 6,041 (avg 3 GCs/tenure)
root table 0 overflows

--
David Finlayson, Ph.D.
Operational Geologist

U.S. Geological Survey
Pacific Science Center
400 Natural Bridges Drive
Santa Cruz, CA 95060, USA

Tel: 831-427-4757, Fax: 831-427-4748, E-mail: [hidden email]
_______________________________________________
Beginners mailing list
[hidden email]
http://lists.squeakfoundation.org/mailman/listinfo/beginners

Randal L. Schwartz

Re: Binary file I/O performance problems

>>>>> "David" == David Finlayson <[hidden email]> writes:

David> ((((a notNil and: [ b notNil ]) and: [ c notNil ])) and: [ d notNil])
David> ifTrue:
David> [ n := a.
David> n := (n bitShift: 8) + b.
David> n := (n bitShift: 8) + c.
David> n := (n bitShift: 8) + d ]
David> ifFalse: [ n := nil ].

This screams for an "early answer" assistant method, something like:

    computeSomething
        a ifNil: [^nil].
        b ifNil: [^nil].
        c ifNil: [^nil].
        d ifNil: [^nil].
        ^(the code with all the bitshifts).

Actually, perhaps even the use of a good detect: would be right here, if you
didn't have a, b, c, d as instvars. In fact, that's much more likely an array
instead of four instvars, which would simplify all the repeated code.

--
Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095
<[hidden email]> <URL:http://www.stonehenge.com/merlyn/>
Smalltalk/Perl/Unix consulting, Technical writing, Comedy, etc. etc.
See http://methodsandmessages.vox.com/ for Smalltalk and Seaside discussion

Zulq Alam-2

Re: Binary file I/O performance problems

In reply to this post by David Finlayson-4

Hi David,

You could try using stream next: 4 to read the 4 bytes in one go:

    [StandardFileStream readOnlyFileNamed: 'Base.image' do:
        [:stream |
        [stream atEnd] whileFalse:
            [stream next.
            stream next.
            stream next.
            stream next]]] timeToRun
    " 328505 "

    [StandardFileStream readOnlyFileNamed: 'Base.image' do:
        [:stream |
        stream binary.
        [stream atEnd] whileFalse:
            [stream next: 4]]] timeToRun
    " 144469 "

If you can, read larger chunks:

    [StandardFileStream readOnlyFileNamed: 'Base.image' do:
        [:stream |
        stream binary.
        [stream atEnd] whileFalse:
            [stream next: 2048]]] timeToRun
    " 343 "

    [StandardFileStream readOnlyFileNamed: 'Base.image' do:
        [:stream |
        stream binary.
        [stream atEnd] whileFalse:
            [stream next: 2048]]] timeToRun
    " 197 "

I'm surprised there isn't a generic class for this, like
java.io.BufferedInputStream. Perhaps I haven't discovered it yet. Anyone?
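
The chunked reads above amount to putting a buffer between the program
and the file, which is what java.io.BufferedInputStream does. As a rough
illustration in C, the language of David's baseline, a minimal buffered
reader might look like the sketch below. BufferedReader, reader_init and
reader_next are hypothetical names, and the 2048-byte capacity is simply
the chunk size from the timing test. C's stdio already buffers FILE
streams internally, so this is only meant to show the mechanism.

```c
#include <assert.h>
#include <stdio.h>

#define BUF_CAPACITY 2048   /* chunk size taken from the timing test */

/* Hypothetical helper type: a read buffer layered over a FILE*. */
typedef struct {
    FILE *fp;
    unsigned char data[BUF_CAPACITY];
    size_t len;   /* number of valid bytes in data */
    size_t pos;   /* index of the next unread byte */
} BufferedReader;

void reader_init(BufferedReader *r, FILE *fp)
{
    assert(r != NULL && fp != NULL);
    r->fp = fp;
    r->len = 0;
    r->pos = 0;
}

/* Return the next byte (0..255), or EOF when the file is exhausted. */
int reader_next(BufferedReader *r)
{
    if (r->pos == r->len) {
        /* Buffer drained: refill it with one large read. */
        r->len = fread(r->data, 1, BUF_CAPACITY, r->fp);
        r->pos = 0;
        if (r->len == 0)
            return EOF;
    }
    return r->data[r->pos++];
}
```

Each call to reader_next touches the file only when the buffer is empty,
so a 123 MB file costs roughly 60,000 reads of 2048 bytes instead of
about 123 million single-byte reads.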

Regards,
Zulq.

David Finlayson-4

Re: Binary file I/O performance problems

In reply to this post by Randal L. Schwartz

Thanks for the style pointers. I'm a scientist, not a programmer, so
it will be rough going while I learn.

What I wanted was an exception (try/except) in case any of the reads
failed. Corrupt files are an expected case that should be handled by
the program. So I can't crash while reading (or writing). Does Squeak
have exceptions? Or is there a Smalltalk pattern for this "try to
execute this, do something else if it fails"? That answer should
probably go into another thread.

David

David Finlayson-4

Re: Binary file I/O performance problems

OK - I made some of the suggested changes. I broke the readers into two parts:

    uint32
        "returns the next unsigned, 32-bit integer from the binary stream"
        isBigEndian
            ifTrue: [^ self nextBigEndianNumber: 4]
            ifFalse: [^ self nextLittleEndianNumber: 4]

Where nextLittleEndianNumber looks like this:

    nextLittleEndianNumber: n
        "Answer the next n bytes as a positive Integer or
        LargePositiveInteger, where the bytes are ordered from least
        significant to most significant.
        Copied from PositionableStream"
        | bytes s |
        [bytes := stream next: n.
        s := 0.
        n to: 1 by: -1 do: [:i | s := (s bitShift: 8) bitOr: (bytes at: i)].
        ^ s]
            on: Error
            do: [^ nil]

This (I think) cleans up some of the code smell, but for only marginal
performance improvements. It seems that I may need to implement a
buffer on the binary stream. Is there a good example on how this
should be done in the image or elsewhere?

I find it troubling that I am having to write code below the
abstraction level of C to read and write data from a file. I thought
Smalltalk was supposed to free me from this kind of drudgery? Right
now, Java looks good and Python/Ruby look fantastic by comparison.

David

Herbert König

Re: Binary file I/O performance problems

In reply to this post by David Finlayson-4

Hello David,

DF> focuses on just reading a small (for us) 123 Mb data file on disk. The
DF> program takes about 166 seconds to run compared to 1.2 seconds for an
DF> equivalent C version (140x faster than Squeak version).

Number crunching and raw speed are not the points where Smalltalk
excels.

0 tinyBenchmarks gives '322824716 bytecodes/sec; 8945704 sends/sec',
which is about 9 million sends per second on my 1.8 GHz Pentium M.

In the browser, when you switch from Source to Byte codes in the
lowest pane (rightmost button), you will see the many sends in your
code. Some of these code fragments (e.g. the arithmetic) would be a
lot faster in any compiled language.

With this you can estimate the performance you can expect.

If it took only one send per byte read from the file, my computer
would need about 10 seconds for 100 MB.
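
The estimate above is simple division. A small C sketch (C being the
language of David's baseline) makes it reusable for other assumptions.
estimated_seconds and its parameters are hypothetical names, and the
result is only a lower bound, since it ignores everything except message
sends.

```c
#include <assert.h>

/* Lower bound on run time if every byte costs a fixed number of sends. */
double estimated_seconds(double bytes, double sends_per_byte,
                         double sends_per_second)
{
    assert(sends_per_second > 0.0);
    return bytes * sends_per_byte / sends_per_second;
}
```

With the tinyBenchmarks figure above,
estimated_seconds(100e6, 1.0, 8945704.0) comes to roughly 11 seconds.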

That's the price for dynamically looking up the receiver's class for
every send.

So I guess this application is better left for other languages.

Cheers,

Herbert mailto:[hidden email]


Waldemar Schwan-2

moving files on Windows

Hello everyone.

Normally I develop on Mac OS 10.5. When I tried to run my code on
Windows Vista, deleting a file threw a

CannotDeleteFileException: Could not delete the old version of file
D:\waldemar\test\movingDestionation\moveMe.txt

Because the error doesn't tell me why the file can't be deleted, I'm
completely stumped. The file is writable.

What I'm trying to do is move a file from one folder to another. To
accomplish that, I create a read-only file stream on the source file and
force the destination directory to create a new file with the same name
as the source file. After that I use FileDirectory>>copyFile:toFile:.

    moveLocalFile: aCBFile3DLocal toMountain: aCBMountain
        | srcDir destDir srcFile destFile |
        srcDir := aCBFile3DLocal file directory fileDirectory.
        destDir := FileDirectory on: aCBMountain path.

        srcFile := srcDir readOnlyFileNamed: aCBFile3DLocal file name.
        srcFile binary.
        destFile := destDir forceNewFileNamed: aCBFile3DLocal file name.
        destFile binary.

        srcDir copyFile: srcFile toFile: destFile.
        srcDir deleteFileNamed: aCBFile3DLocal file name.

Again: this code works on Mac but not on Windows (Vista), not even in
compatibility mode.
I hope someone can give me a hint.

Best regards.
Waldemar

Klaus D. Witzel

Re: Binary file I/O performance problems

In reply to this post by David Finlayson-4

Hi David,

let me respond in "reverse" order of your points:

> I find it troubling that I am having to write code below the
> abstraction level of C to read and write data from a file. I thought
> Smalltalk was supposed to free me from this kind of drudgery? Right
> now, Java looks good and Python/Ruby look fantastic by comparison.

The difference with Squeak/Smalltalk is that the intermediate-level
routines like #uint32 are made available at the Smalltalk language level,
where users can see them, use them and modify them. Such an approach is
seen as an invaluable resource by Smalltalk users. It has a price,
yes.

But Squeak/Smalltalk can be faster, dramatically faster, than what you
observed. The .image file (10s - 100s of MB) is read from disk and
de-endianessed in a second or so. Of course this is possible only because
the file is in a ready-to-use format, but this can be a clue if you
want to consider alternative input methods.

> This (I think) cleans up some of the code smell, but for only marginal
> performance improvements. It seems that I may need to implement a
> buffer on the binary stream. Is there a good example on how this
> should be done in the image or elsewhere?

I don't know of a particular example (specialized somehow on your problem
at hand, for buffered reading of arbitrary "struct"s) but this here is
easy to do in Squeak:

    byteArray := ByteArray new: 2 << 20.
    actuallyTransferred :=
        binaryStream readInto: byteArray startingAt: 1 count: byteArray size

You may perhaps want to check that GBs can be brought into Squeak's memory
in a matter of seconds, just #printIt in a workspace:

    [1024 timesRepeat: [[
        (binaryStream := (SourceFiles at: 1) readOnlyCopy) binary.
        byteArray := ByteArray new: 2 << 20.
        actuallyTransferred :=
            binaryStream reset; readInto:
                byteArray startingAt: 1 count: byteArray size]
        ensure: [binaryStream close]]] timeToRun

When reading from disk 4-byte-wise this makes a huge difference for sure.
From here on you would use the ByteArray protocol (#byteAt:*, #shortAt:*,
#longAt:*, #doubleAt:*) but as mentioned earlier these methods are perhaps
not optimal (when compared to other languages and their implementation
libraries) for you.

Last but not least, when doing performance critical i/o or conversions,
Squeak users sometimes write a Squeak plugin (which then extends the
Squeak VM), still at the Smalltalk/Slang language level but with it they
can do/call any hw-oriented routine for speeding up things dramatically,
and this indeed compares well to other languages and their implementation
libraries :)

HTH.

/Klaus

David T. Lewis

Re: Binary file I/O performance problems

In reply to this post by David Finlayson-4

On Tue, Sep 02, 2008 at 11:00:54PM -0700, David Finlayson wrote:
>
> I find it troubling that I am having to write code below the
> abstraction level of C to read and write data from a file. I thought
> Smalltalk was supposed to free me from this kind of drudgery?

David,

You're quite right about that. The good news is that you have already
figured out how to profile, and you already know where the performance
problem is. Setting aside for the moment the issue of Squeak's awful
file I/O performance, the quickest solution to your problem may also
be the easiest. As long as the data sets are not too large, just load
the whole file into Squeak first (use FileStream>>contentsOfEntireFile)
and *then* operate on the data.

For example, if you have data in MYDATA.BIN, and you want to load it
into Squeak and read the first 100 bytes, you can do something like this:

    | myFile dataStream |
    myFile := FileStream fileNamed: 'MYDATA.BIN'.
    [myFile binary. "answer bytes rather than characters"
    dataStream := ReadStream on: myFile contentsOfEntireFile]
        ensure: [myFile ifNotNilDo: [:f | f close]].
    dataStream next: 100.

Once you have the data in memory, things are quite fast. I know this
sounds like an odd way to handle data loading, but it actually works
very well, and buying some memory is a whole lot easier than fixing
Squeak's I/O performance ;)
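
For comparison, the same load-everything-first approach in C might look
like the sketch below. read_entire_stream is a hypothetical helper, not a
library function. It grows a heap buffer until the stream is exhausted,
so it needs as much free memory as the file is large.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Read everything from the current position of fp to end of file.
 * On success returns a malloc'd buffer (caller frees) and stores its
 * length in *out_len; on failure returns NULL. */
unsigned char *read_entire_stream(FILE *fp, size_t *out_len)
{
    assert(fp != NULL && out_len != NULL);
    size_t cap = 1 << 16, len = 0;
    unsigned char *buf = malloc(cap);
    if (buf == NULL)
        return NULL;
    for (;;) {
        len += fread(buf + len, 1, cap - len, fp);
        if (len < cap)
            break;                      /* short read: EOF or error */
        unsigned char *bigger = realloc(buf, cap * 2);
        if (bigger == NULL) { free(buf); return NULL; }
        buf = bigger;
        cap *= 2;
    }
    if (ferror(fp)) { free(buf); return NULL; }
    *out_len = len;
    return buf;
}
```

After this call all parsing works on plain memory, and the file can be
closed before any decoding starts.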

HTH,
Dave


David Finlayson-4

Re: Binary file I/O performance problems

I re-wrote the test application to load the test file entirely into
memory before parsing the data. The total time to parse the file
decreased by about 50%. Now that I/O is removed from the picture, the
new bottleneck is turning bytes into integers (and then integers into
Floats).

I know that Smalltalk isn't the common language for number crunching,
but if I can get acceptable performance out of it, then down the road
I would like to tap into the Croquet environment. That is why I am
trying to learn a way that will work.

David

Matthias Berth-2

Re: Binary file I/O performance problems

David,

How many possible float values do you have? Maybe a lookup strategy
for the conversion is feasible...

Cheers

Matthias

David Finlayson-4

Re: Binary file I/O performance problems

For the most part, these numbers represent instrument measurements
(swath bathymetry from sonar systems). Precision ranges from 5 to 10
significant figures depending on the specific instrument being
recorded. So it wouldn't really be practical to form a look-up table
in most cases.

What attracted me to Squeak was that I was on the boat a few months
ago and got a functional navigation system built (sort-of like a
Garmin console on a pleasure boat) in about 2 days (used morphic and
the UDPSocket stuff)! That was awesome.

Then I modified the sonogram class to display sonar backscatter data
(like a black-and-white image of the sea floor) in about 2 hours. Very
cool stuff. The only problem was that the sonar data is time
consuming to parse in Squeak, so the sonogram scrolled at about 1 row
per second (our system is collecting data at 8 pings per second). So it
would take me 8 hours to display 1 hour of sonar data.

The distant dream is to paint the sonar data into a Croquet world in
real time where scientists from other stations on the boat (or maybe
over the internet) can see the data rolling in as we collect it. It
would be really cool. Add in our boat as an icon, an ROV (remotely
operated vehicle) and maybe some in-water targets like fish or
whatever and I bet this would be Slashdot stuff! BUT, I need to be
able to get a handle on the speed of Squeak or this won't be
practical.

Maybe I need to write some kind of filter (pre-amplifier) in a
high-performance language that reads the data as it comes in over the
network and then re-broadcasts a decimated data set to Squeak?

David

Yoshiki Ohshima-2

Re: Binary file I/O performance problems

In reply to this post by David Finlayson-4

At Fri, 5 Sep 2008 10:59:03 -0700,
David Finlayson wrote:

>
> I re-wrote the test application to load the test file entirely into
> memory before parsing the data. The total time to parse the file
> decreased by about 50%. Now that I/O is removed from the picture, the
> new bottleneck is turning bytes into integers (and then integers into
> Floats).
>
> I know that Smalltalk isn't the common language for number crunching,
> but if I can get acceptable performance out of it, then down the road
> I would like to tap into the Croquet environment. That is why I am
> trying to learn a way that will work.

If the integers or floats are in the layout of C's int[] or float[],
there is a better chance to make it much faster.

Look at the method Bitmap>>asByteArray and
Bitmap>>copyFromByteArray:. You can convert a big array of non-pointer
words from/to a byte array.

data := (1 to: 1000000) as: FloatArray.
words := Bitmap new: data size.
words replaceFrom: 1 to: data size with: data.
bytes := words asByteArray.

"and you write out the bytes into a binary file."

"to get them back:"

words copyFromByteArray: bytes.
data replaceFrom: 1 to: words size with: words.

Obviously, you can recycle some of the intermediate buffer allocation
and that would speed it up.
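
The reason this layout helps is that no per-element decoding is needed:
the bytes can be copied straight into the numeric array. A minimal C
sketch of that idea follows (C being the language of David's baseline).
bytes_to_floats is a hypothetical name, and the sketch assumes the bytes
are IEEE 754 single-precision values already in the host's byte order.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy count floats out of a raw byte buffer in one call.
 * Assumes the bytes have the layout of a C float[] on this machine. */
void bytes_to_floats(const unsigned char *bytes, float *out, size_t count)
{
    assert(bytes != NULL && out != NULL);
    memcpy(out, bytes, count * sizeof(float));
}
```

The cost is one bulk copy regardless of how many values the buffer
holds, which is the same trade the Bitmap conversion makes.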

FloatArray has some vector arithmetic primitives, and the Kedama
system in the OLPC Etoys image has more elaborate vector arithmetic
primitives on integers and floats, including operations with masked
vectors.

-- Yoshiki

Yoshiki Ohshima-2

Re: Binary file I/O performance problems

In reply to this post by David Finlayson-4

At Fri, 5 Sep 2008 11:33:37 -0700,
David Finlayson wrote:
>
> Then I modified the sonogram class to display sonar backscatter data
> (like a black-and-white image of the sea floor) in about 2 hours. Very
> cool stuff. The only problem was that the sonar data is time
> consuming to parse in Squeak, so the sonogram scrolled at about 1 row
> per second (our system is collecting data at 8 pings per second). So it
> would take me 8 hours to display 1 hour of sonar data.

Ah, cool. In the OLPC Etoys image, there is a more efficient
version of Sonogram called WsSonogram, and it is about 2 times faster
than the original. If you just add a primitive that takes a float
array, calculates the sqrt of all entries and stores them into the
array, that will be 4-5 times faster or so. The code is of course
perfectly portable across platforms (i.e., not tied to OLPC), so
it may be of interest to you.

> The distant dream is to paint the sonar data into a Croquet world in
> real time where scientists from other stations on the boat (or maybe
> over the internet) can see the data rolling in as we collect it. It
> would be really cool. Add in our boat as an icon, an ROV (remotely
> operated vehicle) and maybe some in-water targets like fish or
> whatever and I bet this would be Slashdot stuff! BUT, I need to be
> able to get a handle on the speed of Squeak or this won't be
> practical.

It could be quite practical with a few extra primitives. One could
of course imagine using the GPU. That would be fairly viable.

> Maybe I need to write some kind of filter (pre-amplifier) in a
> high-performance language that reads the data as it comes in over the
> network and then re-broadcasts a decimated data set to Squeak?

That could certainly be an option, too.

-- Yoshiki

Nicolas Cellier-3

Re: Binary file I/O performance problems

In reply to this post by Yoshiki Ohshima-2

Hi David,
your application excites my curiosity. Which company/organization
are you working for, if that's not indiscreet?

I think you will solve most of your performance problems by following
the good advice from Yoshiki.

You might also want to investigate FFI as an alternative way of handling
C-layout-like ByteArray memory from within Smalltalk.
I made an example of its use in Smallapack-Collections (search for
Smallapack on squeaksource, http://www.squeaksource.com/Smallapack/).
ExternalArray is an abstract class for handling memory laid out as a
C array of any type from within Smalltalk (only float, double and
complex are implemented in subclasses, but you can extend it), and in
fact FFI can handle any structure (though you might have to resolve
alignment problems yourself).
There's a trade-off between fast reading (no conversion) and slower
access (conversion at each access). However, with the
ByteArray>>#doubleAt: and #floatAt: primitives (from FFI), and fast
hacks to reverse the endianness of a whole array at once if needed,
ExternalArrays of elementary types or small structures still provide
reasonable access times.
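
Reversing the endianness of a whole array at once can be a single tight
loop over the raw bytes, done once after loading rather than at every
access. A minimal C sketch, assuming 8-byte elements such as doubles, is
shown below. swap_doubles_in_place is a hypothetical name and is not
taken from Smallapack or FFI.

```c
#include <assert.h>
#include <stddef.h>

/* Reverse the byte order of every 8-byte element in place. */
void swap_doubles_in_place(unsigned char *bytes, size_t count)
{
    assert(bytes != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        unsigned char *p = bytes + i * 8;
        /* Swap the outer bytes inward: 0<->7, 1<->6, 2<->5, 3<->4. */
        for (size_t j = 0; j < 4; j++) {
            unsigned char tmp = p[j];
            p[j] = p[7 - j];
            p[7 - j] = tmp;
        }
    }
}
```

Applying the function twice restores the original bytes, which makes it
easy to check.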

Nicolas


Nicolas Cellier-3

Re: Binary file I/O performance problems

I forgot to provide some timings (Athlon, 32-bit, 1 GHz) for write/read
access:

    | a b c |
    {
        [a := FloatArray withAll: (1 to: 100000)] timeToRun.
        [b := ExternalFloatArray withAll: (1 to: 100000)] timeToRun.
        [c := ExternalDoubleArray withAll: (1 to: 100000)] timeToRun.
        [a do: [:e | ]] timeToRun.
        [b do: [:e | ]] timeToRun.
        [c do: [:e | ]] timeToRun.
    }.
    #(142 312 335 80 181 182)


Yoshiki Ohshima-2

Re: Re: Binary file I/O performance problems

In reply to this post by Nicolas Cellier-3

At Fri, 05 Sep 2008 23:00:07 +0200,
nicolas cellier wrote:
>
> Hi David,
> your application excites my curiosity. Which company/organization
> are you working for, if that's not indiscreet?

I assume the answer is USGS, because of his email address! Yes, it
sounds like something cool is going on.

-- Yoshiki

David Finlayson-4

Re: Re: Binary file I/O performance problems

In reply to this post by Nicolas Cellier-3

Unfortunately, the data is not a simple block of floats. For example,
in C here is how I read a "ping" header block from one of our vendors'
formats:

/* read_xyza_ping: read ping block, returns 1 if successful, EOF if
 * end of file */
int read_xyza_ping(FILE *fin, XYZA_Ping *pp) {
    int8_t byte[4];   /* scratch space for skipped padding bytes */

    fread(&pp->linename, sizeof(int8_t), MAX_LINENAME_LEN, fin);
    fread(&pp->pingnum, sizeof(uint32_t), 1, fin);
    fread(&byte, sizeof(int8_t), 4, fin);   /* skip 4 bytes of padding */
    fread(&pp->time, sizeof(double), 1, fin);
    fread(&pp->notxers, sizeof(int32_t), 1, fin);
    fread(&byte, sizeof(int8_t), 4, fin);   /* skip 4 bytes of padding */
    read_posn(fin, &pp->posn);
    fread(&pp->roll, sizeof(double), 1, fin);
    fread(&pp->pitch, sizeof(double), 1, fin);
    fread(&pp->heading, sizeof(double), 1, fin);
    fread(&pp->height, sizeof(double), 1, fin);
    fread(&pp->tide, sizeof(double), 1, fin);
    fread(&pp->sos, sizeof(double), 1, fin);

    if (ferror(fin) != 0) {
        perror("sxpfile: error: (read_xyza_ping)");
        abort();
    }

    // time between 1995 - 2020?
    assert(788936400 < pp->time && pp->time < 1577865600);
    assert(0 < pp->notxers && pp->notxers <= MAX_TXERS);
    assert(-90.0 < pp->roll && pp->roll < 90.0);
    assert(-90.0 < pp->pitch && pp->pitch < 90.0);
    assert(0.0 <= pp->heading && pp->heading <= 360.0);

    // heave values
    assert(-10.0 < pp->height && pp->height < 10.0);
    assert(-100 < pp->tide && pp->tide < 100.0);

    // speed of sound reasonable? (freshwater too)
    assert(1000 <= pp->sos && pp->sos < 1600);

    return feof(fin) ? EOF : 1;
}

Note how there are various sized integers and floating point numbers
mixed together along with padding space put into the file during the
write (the original engineer must have just used fwrite on the
structs).
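
To show where that padding comes from, here is a small C sketch that measures the gap a compiler leaves between a 32-bit integer and a following double. DemoPingHead and DEMO_LINENAME_LEN are hypothetical stand-ins (the real XYZA_Ping definition and MAX_LINENAME_LEN are not shown here), and the result depends on the platform's alignment rules.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_LINENAME_LEN 40   /* hypothetical; real MAX_LINENAME_LEN not shown */

/* Hypothetical struct mirroring only the first three fields of the header. */
typedef struct {
    int8_t   linename[DEMO_LINENAME_LEN];
    uint32_t pingnum;
    double   time;
} DemoPingHead;

/* Number of padding bytes the compiler inserted between pingnum and time. */
size_t demo_pad_before_time(void)
{
    size_t end_of_pingnum = offsetof(DemoPingHead, pingnum) + sizeof(uint32_t);
    assert(offsetof(DemoPingHead, time) >= end_of_pingnum);
    return offsetof(DemoPingHead, time) - end_of_pingnum;
}
```

On platforms that align double to 8 bytes this typically reports 4, which would match the 4-byte skips in read_xyza_ping; on platforms that align double to 4 bytes it can be 0. A file written with fwrite on the struct inherits whatever the writing machine's compiler chose.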

The notxers variable above indicates the number of XYZA_Txer structs
to follow, each XYZA_Txer struct indicates the number of XYZA_Point
structs to follow and so on until the entire structure is read into
memory. Then you start over again and read the next ping.

It is painful, but I don't know any other way to read the data than
one structure at a time.
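
One possible variation, sketched below in C, is to fetch the fixed-size part of a block with a single read and then pick the fields out of the in-memory buffer at known offsets. The record layout, offsets, and names (DemoRec, demo_decode, demo_read) are assumptions made up for this example and do not describe the real vendor format.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical on-disk layout:
 * 4-byte pingnum, 4 bytes of padding, 8-byte time. */
enum { DEMO_OFF_PINGNUM = 0, DEMO_OFF_TIME = 8, DEMO_REC_SIZE = 16 };

typedef struct {
    uint32_t pingnum;
    double   time;
} DemoRec;

/* Decode one record from a raw buffer.
 * Assumes the file and the host share the same byte order. */
void demo_decode(const unsigned char *buf, DemoRec *out)
{
    assert(buf != NULL && out != NULL);
    memcpy(&out->pingnum, buf + DEMO_OFF_PINGNUM, sizeof out->pingnum);
    memcpy(&out->time,    buf + DEMO_OFF_TIME,    sizeof out->time);
}

/* Read one record with a single fread instead of one call per field.
 * Returns 1 on success, EOF on a short read. */
int demo_read(FILE *fin, DemoRec *out)
{
    unsigned char buf[DEMO_REC_SIZE];
    if (fread(buf, 1, sizeof buf, fin) != sizeof buf)
        return EOF;
    demo_decode(buf, out);
    return 1;
}
```

The variable-length parts (the Txer and Point counts) still force one read per level, but each level then costs one call rather than one call per field. The same idea carries over to Squeak: read a chunk into a ByteArray once and decode from it, rather than calling next for every byte.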

--
David Finlayson, Ph.D.
Operational Geologist

U.S. Geological Survey
Pacific Science Center
400 Natural Bridges Drive
Santa Cruz, CA 95060, USA

Tel: 831-427-4757, Fax: 831-427-4748, E-mail: [hidden email]

David Finlayson-4

Re: Re: Binary file I/O performance problems

In reply to this post by Yoshiki Ohshima-2

Coastal and marine geology, USGS. But this isn't an official project.
Just a pipe dream of mine right now. I am not even sure I am competent
enough to pull it off by myself. However, I figure the best way to get
support for this is to build a semi-working prototype and then show it
off and see what happens.

I do wish Cog were further along though. Without Croquet, VW isn't
really an option. I don't know if other languages support the 3D
collaboration that Croquet promises. Meanwhile, I need to learn more
Smalltalk.

David

Yoshiki Ohshima-2

Re: Re: Binary file I/O performance problems

In reply to this post by David Finlayson-4

At Fri, 5 Sep 2008 14:49:29 -0700,
David Finlayson wrote:
>
> Unfortunately, the data is not a simple block of floats. For example,
> here is how I read a "ping" header block in C from one of our vendors'
> formats:

I'm sure that there are other implications, but it sounds like you
do need some primitives to make it efficient. I would write a
primitive equivalent to read_xyza_ping() that fills a Squeak
object or, if you are dealing with an array of XYZA_Ping structures,
builds an array of homogeneous arrays, so that all linenames are stored
in a ByteArray, all pingnums in a WordArray, etc. That
way, you may still be able to use the vector primitives.
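
A rough C sketch of that "array of homogeneous arrays" layout follows, as one way a primitive might organize its output. PingColumns and its functions are hypothetical names for this example; only two of the ping fields are shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical column layout: one homogeneous array per field,
 * instead of one struct per ping. */
typedef struct {
    size_t    count;
    uint32_t *pingnum;   /* all pingnums together (WordArray-like) */
    double   *roll;      /* all roll angles together */
} PingColumns;

/* Allocate zeroed columns for `count` pings; returns 0 on success, -1 on failure. */
int ping_columns_init(PingColumns *pc, size_t count)
{
    assert(pc != NULL);
    pc->count   = count;
    pc->pingnum = calloc(count ? count : 1, sizeof *pc->pingnum);
    pc->roll    = calloc(count ? count : 1, sizeof *pc->roll);
    if (pc->pingnum == NULL || pc->roll == NULL) {
        free(pc->pingnum);
        free(pc->roll);
        pc->pingnum = NULL;
        pc->roll = NULL;
        pc->count = 0;
        return -1;
    }
    return 0;
}

void ping_columns_free(PingColumns *pc)
{
    free(pc->pingnum);
    free(pc->roll);
    pc->pingnum = NULL;
    pc->roll = NULL;
    pc->count = 0;
}

/* A vector-style operation touches one contiguous column only. */
double ping_columns_mean_roll(const PingColumns *pc)
{
    double sum = 0.0;
    for (size_t i = 0; i < pc->count; i++)
        sum += pc->roll[i];
    return pc->count ? sum / (double)pc->count : 0.0;
}
```

Because each column is a plain contiguous array of one type, it maps naturally onto a single Squeak ByteArray, WordArray, or FloatArray, which is what lets the vector primitives operate on it directly.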

-- Yoshiki