Chris
"Chris Uppal" <
[hidden email]> wrote in message
news:
[hidden email]...
> [I was plain wrong with the last of these I posted, but I'll risk another
> one anyway]
>
> I think the implementation of SequenceableCollection>>beginsWith: is a bit
> wonkey. It's based on #indexOfSubCollection:startingAt: answering 1, but
> what that means is that if the receiver *doesn't* start with the prefix,
> then it takes time proportional to the length of the receiver to find out.
>...
Thanks, that is a bit daft. Defect # 251.
>
> BTW, I and others have sent a few posts about sub-optimal implementations
of
> algorithms lately. It's nice to be able to balance it with a more
positive
> note; I've just had to create a specialised file scanner not unlike the
> Windows find-in-files feature -- my Dolphin version is faster than the
> Windows one, even though I haven't worked out how to do overlapped disk IO
> yet (is this even possible -- I'm thinking of reading file data in one
> thread and scanning it in another ?).
It is possible, but I'd be concerned that the thread synchronisation
overhead might overwhelm the gains. An approach which might be very fast
would be to use memory mapped files - that way one could treat each file as
a byte array and scan it with fast primitives such as the Boyer-Moore
algorithm behind String>>indexOfSubCollection:startingAt: (primitive 149).
Leaving the OS to load the file into memory by demand paging is both simple
and probably more efficient. Unfortunately though, while the primitive
itself works just as well for ByteArrays, it doesn't handle
<ExternalAddress>s. The underlying stringSearch() function could be exported
from the VM (if anyone is interested) and called through the FFI mechanism
to search arbitrary blocks of memory very efficiently.
Regards
Blair