[squeak-dev] The Trunk: Collections-rkrk.117.mcz

commits-2

Andreas Raab uploaded a new version of Collections to project The Trunk:
http://source.squeak.org/trunk/Collections-rkrk.117.mcz

==================== Summary ====================

Name: Collections-rkrk.117
Author: rkrk
Time: 23 August 2009, 2:17:21 am
UUID: ed4258b2-0264-43e2-8adb-4e69cfdf1722
Ancestors: Collections-nice.116

Bugfix for bug #0006579 reported by Nicolas Cellier. In String>>findLastOccuranceOfString:startingAt: the search now resumes at "last + 1" instead of "last + subString size", so overlapping occurrences of subString are no longer skipped. CollectionsTests-rkrk.90 has a test for this method.
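
The effect of the change can be sketched outside the image. The following is a minimal Python model of the loop, not the Squeak source: the names find_substring and find_last_occurrence and the step_by_size flag are hypothetical, and Python's str.find stands in for findSubstring:in:startingAt:matchTable: under the assumption of a plain case-sensitive match with 1-based indices.

```python
def find_substring(text, sub, start):
    """Hypothetical stand-in for the Squeak primitive: answer the 1-based
    index of the first occurrence of sub at or after start, or 0 if none."""
    # str.find is 0-based and returns -1 on failure, so +1 maps it to 1-based / 0.
    return text.find(sub, start - 1) + 1


def find_last_occurrence(text, sub, start, step_by_size=False):
    """Model of findLastOccuranceOfString:startingAt:.

    step_by_size=True mimics the old behaviour (resume at last + subString size);
    the default mimics the fixed behaviour (resume at last + 1).
    """
    last = find_substring(text, sub, start)
    if last == 0:
        return 0
    now = last
    while last > 0:
        now = last
        step = len(sub) if step_by_size else 1
        last = find_substring(text, sub, last + step)
    return now


# 'aa' occurs in 'aaa' at indices 1 and 2; the two occurrences overlap.
old_answer = find_last_occurrence('aaa', 'aa', 1, step_by_size=True)  # 1
new_answer = find_last_occurrence('aaa', 'aa', 1)                      # 2
```

With the old step the search resumes at index 3, past the start of the second occurrence, so the method answers 1; with a step of 1 it answers 2.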

=============== Diff against Collections-nice.116 ===============

Item was changed:
Array weakSubclass: #WeakArray
instanceVariableNames: ''
+ classVariableNames: 'FinalizationDependents FinalizationLock FinalizationProcess FinalizationSemaphore IsFinalizationSupported'
- classVariableNames: 'FinalizationDependents FinalizationProcess FinalizationSemaphore FinalizationLock IsFinalizationSupported'
poolDictionaries: ''
category: 'Collections-Weak'!

!WeakArray commentStamp: '<historical>' prior: 0!
WeakArray is an array which holds only weakly on its elements. This means whenever an object is only referenced by instances of WeakArray it will be garbage collected.!

Item was changed:
String subclass: #Symbol
instanceVariableNames: ''
+ classVariableNames: 'NewSymbols OneCharacterSymbols SymbolTable'
- classVariableNames: 'SymbolTable OneCharacterSymbols NewSymbols'
poolDictionaries: ''
category: 'Collections-Strings'!

!Symbol commentStamp: '<historical>' prior: 0!
I represent Strings that are created uniquely. Thus, someString asSymbol == someString asSymbol.!

Item was changed:
MimeConverter subclass: #Base64MimeConverter
instanceVariableNames: 'data'
+ classVariableNames: 'FromCharTable ToCharTable'
- classVariableNames: 'ToCharTable FromCharTable'
poolDictionaries: ''
category: 'Collections-Streams'!

!Base64MimeConverter commentStamp: '<historical>' prior: 0!
This class encodes and decodes data in Base64 format. This is MIME encoding. We translate a whole stream at once, taking a Stream as input and giving one as output. Returns a whole stream for the caller to use.
 0 A    17 R    34 i    51 z
 1 B    18 S    35 j    52 0
 2 C    19 T    36 k    53 1
 3 D    20 U    37 l    54 2
 4 E    21 V    38 m    55 3
 5 F    22 W    39 n    56 4
 6 G    23 X    40 o    57 5
 7 H    24 Y    41 p    58 6
 8 I    25 Z    42 q    59 7
 9 J    26 a    43 r    60 8
10 K    27 b    44 s    61 9
11 L    28 c    45 t    62 +
12 M    29 d    46 u    63 /
13 N    30 e    47 v
14 O    31 f    48 w    (pad) =
15 P    32 g    49 x
16 Q    33 h    50 y
Outbound: bytes are broken into 6 bit chunks, and the 0-63 value is converted to a character. 3 data bytes go into 4 characters.
Inbound: Characters are translated in to 0-63 values and shifted into 8 bit bytes.

(See: N. Borenstein, Bellcore, N. Freed, Innosoft, Network Working Group, Request for Comments: RFC 1521, September 1993, MIME (Multipurpose Internet Mail Extensions) Part One: Mechanisms for Specifying and Describing the Format of Internet Message Bodies. Sec 6.2)

By Ted Kaehler, based on Tim Olson's Base64Filter.!

Item was changed:
Magnitude subclass: #Character
instanceVariableNames: 'value'
+ classVariableNames: 'CharacterTable ClassificationTable LetterBits LowercaseBit UppercaseBit'
- classVariableNames: 'ClassificationTable CharacterTable UppercaseBit LowercaseBit LetterBits'
poolDictionaries: ''
category: 'Collections-Strings'!

!Character commentStamp: 'ar 4/9/2005 22:35' prior: 0!
I represent a character by storing its associated Unicode. The first 256 characters are created uniquely, so that all instances of latin1 characters ($R, for example) are identical.

The code point is based on Unicode. Since Unicode is 21-bit wide character set, we have several bits available for other information. As the Unicode Standard states, a Unicode code point doesn't carry the language information. This is going to be a problem with the languages so called CJK (Chinese, Japanese, Korean. Or often CJKV including Vietnamese). Since the characters of those languages are unified and given the same code point, it is impossible to display a bare Unicode code point in an inspector or such tools. To utilize the extra available bits, we use them for identifying the languages. Since the old implementation uses the bits to identify the character encoding, the bits are sometimes called "encoding tag" or neutrally "leading char", but the bits rigidly denotes the concept of languages.

The other languages can have the language tag if you like. This will help to break the large default font (font set) into separately loadable chunk of fonts. However, it is open to the each native speakers and writers to decide how to define the character equality, since the same Unicode code point may have different language tag thus simple #= comparison may return false.

I represent a character by storing its associated ASCII code (extended to 256 codes). My instances are created uniquely, so that all instances of a character ($R, for example) are identical.!

Item was changed:
ArrayedCollection subclass: #String
instanceVariableNames: ''
+ classVariableNames: 'AsciiOrder CSLineEnders CSNonSeparators CSSeparators CaseInsensitiveOrder CaseSensitiveOrder HtmlEntities LowercasingTable Tokenish UppercasingTable'
- classVariableNames: 'CSNonSeparators CaseInsensitiveOrder AsciiOrder HtmlEntities Tokenish LowercasingTable UppercasingTable CSSeparators CSLineEnders CaseSensitiveOrder'
poolDictionaries: ''
category: 'Collections-Strings'!

!String commentStamp: '<historical>' prior: 0!
A String is an indexed collection of Characters. Class String provides the abstract super class for ByteString (that represents an array of 8-bit Characters) and WideString (that represents an array of 32-bit characters). In the similar manner of LargeInteger and SmallInteger, those subclasses are chosen accordingly for a string; namely as long as the system can figure out so, the String is used to represent the given string.

Strings support a vast array of useful methods, which can best be learned by browsing and trying out examples as you find them in the code.

Here are a few useful methods to look at...
String match:
String contractTo:

String also inherits many useful methods from its hierarchy, such as
SequenceableCollection ,
SequenceableCollection copyReplaceAll:with:
!

Item was changed:
----- Method: String>>findLastOccuranceOfString:startingAt: (in category 'accessing') -----
  findLastOccuranceOfString: subString startingAt: start
      "Answer the index of the last occurance of subString within the receiver, starting at start. If
      the receiver does not contain subString, answer 0."

      | last now |
      last := self findSubstring: subString in: self startingAt: start matchTable: CaseSensitiveOrder.
      last = 0 ifTrue: [^ 0].
      [last > 0] whileTrue: [
          now := last.
+         last := self findSubstring: subString in: self startingAt: last + 1 matchTable: CaseSensitiveOrder.
-         last := self findSubstring: subString in: self startingAt: last + subString size matchTable: CaseSensitiveOrder.
      ].

      ^ now.
!