The Trunk: Collections-tfel.623.mcz

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
10 messages Options
Reply | Threaded
Open this post in threaded view
|

The Trunk: Collections-tfel.623.mcz

commits-2
Tim Felgentreff uploaded a new version of Collections to project The Trunk:
http://source.squeak.org/trunk/Collections-tfel.623.mcz

==================== Summary ====================

Name: Collections-tfel.623
Author: tfel
Time: 1 May 2015, 11:06:35.625 am
UUID: baf2902d-52b9-2442-a513-8a5a6ff0ce30
Ancestors: Collections-nice.622

fix fallback code for ByteString>>findSubstring:in:startingAt:matchTable: when passing a starting index <= 0

=============== Diff against Collections-nice.622 ===============

Item was changed:
  ----- Method: ByteString>>findSubstring:in:startingAt:matchTable: (in category 'comparing') -----
  findSubstring: key in: body startingAt: start matchTable: matchTable
  "Answer the index in the string body at which the substring key first occurs, at or beyond start.  The match is determined using matchTable, which can be used to effect, eg, case-insensitive matches.  If no match is found, zero will be returned.
 
  The algorithm below is not optimum -- it is intended to be translated to C which will go so fast that it wont matter."
  | index |
  <primitive: 'primitiveFindSubstring' module: 'MiscPrimitivePlugin'>
  <var: #key declareC: 'unsigned char *key'>
  <var: #body declareC: 'unsigned char *body'>
  <var: #matchTable declareC: 'unsigned char *matchTable'>
 
  key size = 0 ifTrue: [^ 0].
+ (start max: 1) to: body size - key size + 1 do:
- start to: body size - key size + 1 do:
  [:startIndex |
  index := 1.
  [(matchTable at: (body at: startIndex+index-1) asciiValue + 1)
  = (matchTable at: (key at: index) asciiValue + 1)]
  whileTrue:
  [index = key size ifTrue: [^ startIndex].
  index := index+1]].
  ^ 0
  "
  ' ' findSubstring: 'abc' in: 'abcdefabcd' startingAt: 1 matchTable: CaseSensitiveOrder 1
  ' ' findSubstring: 'abc' in: 'abcdefabcd' startingAt: 2 matchTable: CaseSensitiveOrder 7
  ' ' findSubstring: 'abc' in: 'abcdefabcd' startingAt: 8 matchTable: CaseSensitiveOrder 0
  ' ' findSubstring: 'abc' in: 'abcdefABcd' startingAt: 2 matchTable: CaseSensitiveOrder 0
  ' ' findSubstring: 'abc' in: 'abcdefABcd' startingAt: 2 matchTable: CaseInsensitiveOrder 7
  "!


Reply | Threaded
Open this post in threaded view
|

Re: The Trunk: Collections-tfel.623.mcz

Chris Muller-3
I know you are making the Smalltalk code behave teh same as the
primitive but did you consider making the primitive behave the same as
the Smalltalk code and signal an error?

I dunno, if an app is going to calculate a starting position to search
a String and gets it wrong, this code will quietly produce erroneous
results.  Such is the type of bug that could go unnoticed for a long
time.  This seems like one of those things that is low-level enough
that an error should be signaled..

On Fri, May 1, 2015 at 4:06 AM,  <[hidden email]> wrote:

> Tim Felgentreff uploaded a new version of Collections to project The Trunk:
> http://source.squeak.org/trunk/Collections-tfel.623.mcz
>
> ==================== Summary ====================
>
> Name: Collections-tfel.623
> Author: tfel
> Time: 1 May 2015, 11:06:35.625 am
> UUID: baf2902d-52b9-2442-a513-8a5a6ff0ce30
> Ancestors: Collections-nice.622
>
> fix fallback code for ByteString>>findSubstring:in:startingAt:matchTable: when passing a starting index <= 0
>
> =============== Diff against Collections-nice.622 ===============
>
> Item was changed:
>   ----- Method: ByteString>>findSubstring:in:startingAt:matchTable: (in category 'comparing') -----
>   findSubstring: key in: body startingAt: start matchTable: matchTable
>         "Answer the index in the string body at which the substring key first occurs, at or beyond start.  The match is determined using matchTable, which can be used to effect, eg, case-insensitive matches.  If no match is found, zero will be returned.
>
>         The algorithm below is not optimum -- it is intended to be translated to C which will go so fast that it wont matter."
>         | index |
>         <primitive: 'primitiveFindSubstring' module: 'MiscPrimitivePlugin'>
>         <var: #key declareC: 'unsigned char *key'>
>         <var: #body declareC: 'unsigned char *body'>
>         <var: #matchTable declareC: 'unsigned char *matchTable'>
>
>         key size = 0 ifTrue: [^ 0].
> +       (start max: 1) to: body size - key size + 1 do:
> -       start to: body size - key size + 1 do:
>                 [:startIndex |
>                 index := 1.
>                         [(matchTable at: (body at: startIndex+index-1) asciiValue + 1)
>                                 = (matchTable at: (key at: index) asciiValue + 1)]
>                                 whileTrue:
>                                 [index = key size ifTrue: [^ startIndex].
>                                 index := index+1]].
>         ^ 0
>   "
>   ' ' findSubstring: 'abc' in: 'abcdefabcd' startingAt: 1 matchTable: CaseSensitiveOrder 1
>   ' ' findSubstring: 'abc' in: 'abcdefabcd' startingAt: 2 matchTable: CaseSensitiveOrder 7
>   ' ' findSubstring: 'abc' in: 'abcdefabcd' startingAt: 8 matchTable: CaseSensitiveOrder 0
>   ' ' findSubstring: 'abc' in: 'abcdefABcd' startingAt: 2 matchTable: CaseSensitiveOrder 0
>   ' ' findSubstring: 'abc' in: 'abcdefABcd' startingAt: 2 matchTable: CaseInsensitiveOrder 7
>   "!
>
>

Reply | Threaded
Open this post in threaded view
|

Re: The Trunk: Collections-tfel.623.mcz

marcel.taeumel (old)
The thing is that the primitive accepts indices < 1 has starting position. All projects that rely on this will fails if that primitive is not present. So -- and only for strings -- I guess it is fair to simulate the actual behavior of the primtive in case of a primtive-fail for strings only.

So, this change will not harm existing projects but ensures that those keep on working if the primitive is not existing. Before that update, those projects failed unexpectedly.

Best,
Marcel
Reply | Threaded
Open this post in threaded view
|

Re: The Trunk: Collections-tfel.623.mcz

Chris Muller-3
On Fri, May 1, 2015 at 9:39 AM, Marcel Taeumel
<[hidden email]> wrote:
> The thing is that the primitive accepts indices < 1 has starting position.
> All projects that rely on this will fails if that primitive is not present.
> So -- and only for strings -- I guess it is fair to simulate the actual
> behavior of the primtive in case of a primtive-fail for strings only.
>
> So, this change will not harm existing projects but ensures that those keep
> on working if the primitive is not existing. Before that update, those
> projects failed unexpectedly.

Right, but that wasn't my question.

Reply | Threaded
Open this post in threaded view
|

Re: The Trunk: Collections-tfel.623.mcz

marcel.taeumel (old)
If we change the primitive and signal an error, existing projects will break. It's a tricky situation. :)

Best,
Marcel
Reply | Threaded
Open this post in threaded view
|

Re: The Trunk: Collections-tfel.623.mcz

timfelgentreff
I agree with Marcel - we even have code in the trunk image that passes negative indices. I would also, in the long run, like to fix the primitive to not accept negative indices. But I don't like to break people's stuff too quickly, so I think we should have a deprecation warning for the current behavior before fixing it.
Reply | Threaded
Open this post in threaded view
|

Re: The Trunk: Collections-tfel.623.mcz

David T. Lewis
On Sat, May 02, 2015 at 01:48:15AM -0700, timfelgentreff wrote:
> I agree with Marcel - we even have code in the trunk image that passes
> negative indices. I would also, in the long run, like to fix the primitive
> to not accept negative indices. But I don't like to break people's stuff too
> quickly, so I think we should have a deprecation warning for the current
> behavior before fixing it.
>

+1

I might add that it is good practice for the image to avoid unnecessary
assumptions about the VM. The image may want to run on a variety of VMs
implemented in different languages, running on different platforms, and
featuring different bugs.

But of course the VMs should all be perfect, bug free, and 100% consistent ;-)

Dave

Reply | Threaded
Open this post in threaded view
|

Re: The Trunk: Collections-tfel.623.mcz

timrowledge

On 02-05-2015, at 5:59 AM, David T. Lewis <[hidden email]> wrote:
> But of course the VMs should all be perfect, bug free, and 100% consistent ;-)

They virtually are...

tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
You never really learn to swear until you learn to drive in Silicon Valley



Reply | Threaded
Open this post in threaded view
|

Re: The Trunk: Collections-tfel.623.mcz

Levente Uzonyi-2
In reply to this post by commits-2
On Fri, 1 May 2015, [hidden email] wrote:

> Tim Felgentreff uploaded a new version of Collections to project The Trunk:
> http://source.squeak.org/trunk/Collections-tfel.623.mcz
>
> ==================== Summary ====================
>
> Name: Collections-tfel.623
> Author: tfel
> Time: 1 May 2015, 11:06:35.625 am
> UUID: baf2902d-52b9-2442-a513-8a5a6ff0ce30
> Ancestors: Collections-nice.622
>
> fix fallback code for ByteString>>findSubstring:in:startingAt:matchTable: when passing a starting index <= 0

If I'm not mistaken, this is the slang code that gets compiled to C. Does
slang support #max:?

Levente

>
> =============== Diff against Collections-nice.622 ===============
>
> Item was changed:
>  ----- Method: ByteString>>findSubstring:in:startingAt:matchTable: (in category 'comparing') -----
>  findSubstring: key in: body startingAt: start matchTable: matchTable
>   "Answer the index in the string body at which the substring key first occurs, at or beyond start.  The match is determined using matchTable, which can be used to effect, eg, case-insensitive matches.  If no match is found, zero will be returned.
>
>   The algorithm below is not optimum -- it is intended to be translated to C which will go so fast that it wont matter."
>   | index |
>   <primitive: 'primitiveFindSubstring' module: 'MiscPrimitivePlugin'>
>   <var: #key declareC: 'unsigned char *key'>
>   <var: #body declareC: 'unsigned char *body'>
>   <var: #matchTable declareC: 'unsigned char *matchTable'>
>
>   key size = 0 ifTrue: [^ 0].
> + (start max: 1) to: body size - key size + 1 do:
> - start to: body size - key size + 1 do:
>   [:startIndex |
>   index := 1.
>   [(matchTable at: (body at: startIndex+index-1) asciiValue + 1)
>   = (matchTable at: (key at: index) asciiValue + 1)]
>   whileTrue:
>   [index = key size ifTrue: [^ startIndex].
>   index := index+1]].
>   ^ 0
>  "
>  ' ' findSubstring: 'abc' in: 'abcdefabcd' startingAt: 1 matchTable: CaseSensitiveOrder 1
>  ' ' findSubstring: 'abc' in: 'abcdefabcd' startingAt: 2 matchTable: CaseSensitiveOrder 7
>  ' ' findSubstring: 'abc' in: 'abcdefabcd' startingAt: 8 matchTable: CaseSensitiveOrder 0
>  ' ' findSubstring: 'abc' in: 'abcdefABcd' startingAt: 2 matchTable: CaseSensitiveOrder 0
>  ' ' findSubstring: 'abc' in: 'abcdefABcd' startingAt: 2 matchTable: CaseInsensitiveOrder 7
>  "!
>
>
>

Reply | Threaded
Open this post in threaded view
|

Re: The Trunk: Collections-tfel.623.mcz

Eliot Miranda-2
Hi Levente,

On May 3, 2015, at 5:12 AM, Levente Uzonyi <[hidden email]> wrote:

> On Fri, 1 May 2015, [hidden email] wrote:
>
>> Tim Felgentreff uploaded a new version of Collections to project The Trunk:
>> http://source.squeak.org/trunk/Collections-tfel.623.mcz
>>
>> ==================== Summary ====================
>>
>> Name: Collections-tfel.623
>> Author: tfel
>> Time: 1 May 2015, 11:06:35.625 am
>> UUID: baf2902d-52b9-2442-a513-8a5a6ff0ce30
>> Ancestors: Collections-nice.622
>>
>> fix fallback code for ByteString>>findSubstring:in:startingAt:matchTable: when passing a starting index <= 0
>
> If I'm not mistaken, this is the slang code that gets compiled to C. Does slang support #max:?

Yes

>
> Levente
>
>>
>> =============== Diff against Collections-nice.622 ===============
>>
>> Item was changed:
>> ----- Method: ByteString>>findSubstring:in:startingAt:matchTable: (in category 'comparing') -----
>> findSubstring: key in: body startingAt: start matchTable: matchTable
>>    "Answer the index in the string body at which the substring key first occurs, at or beyond start.  The match is determined using matchTable, which can be used to effect, eg, case-insensitive matches.  If no match is found, zero will be returned.
>>
>>    The algorithm below is not optimum -- it is intended to be translated to C which will go so fast that it wont matter."
>>    | index |
>>    <primitive: 'primitiveFindSubstring' module: 'MiscPrimitivePlugin'>
>>    <var: #key declareC: 'unsigned char *key'>
>>    <var: #body declareC: 'unsigned char *body'>
>>    <var: #matchTable declareC: 'unsigned char *matchTable'>
>>
>>    key size = 0 ifTrue: [^ 0].
>> +    (start max: 1) to: body size - key size + 1 do:
>> -    start to: body size - key size + 1 do:
>>        [:startIndex |
>>        index := 1.
>>            [(matchTable at: (body at: startIndex+index-1) asciiValue + 1)
>>                = (matchTable at: (key at: index) asciiValue + 1)]
>>                whileTrue:
>>                [index = key size ifTrue: [^ startIndex].
>>                index := index+1]].
>>    ^ 0
>> "
>> ' ' findSubstring: 'abc' in: 'abcdefabcd' startingAt: 1 matchTable: CaseSensitiveOrder 1
>> ' ' findSubstring: 'abc' in: 'abcdefabcd' startingAt: 2 matchTable: CaseSensitiveOrder 7
>> ' ' findSubstring: 'abc' in: 'abcdefabcd' startingAt: 8 matchTable: CaseSensitiveOrder 0
>> ' ' findSubstring: 'abc' in: 'abcdefABcd' startingAt: 2 matchTable: CaseSensitiveOrder 0
>> ' ' findSubstring: 'abc' in: 'abcdefABcd' startingAt: 2 matchTable: CaseInsensitiveOrder 7
>> "!
>