Slow object printOn: with EURO symbol

HilaireFernandes
Hello,

In Pharo 3, I want the text representation of one of my objects to
include the euro symbol:

CGMoney>>printOn: aStream
        aStream << (amount asScaledDecimal: 2) greaseString
                << ' ' << '€'


It appears to be very slow when I browse a collection of such objects
(say 1500) from an inspector.

Replacing '€' with 'EUR' restores normal rendering time.

The '€' literal is encoded as a WideString, so I am not sure what is
happening. Is all the content converted to WideString, or what?
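
For reference, a rough way to see the difference outside the inspector is a Workspace snippet like the one below. It is only a sketch: the '12.00' amount and the count of 1500 are made up for illustration, and it assumes that String class >> streamContents: starts from a ByteString buffer, as printString does.

```smalltalk
"Time 1500 small prints ending in an ASCII suffix versus the euro sign.
 The '12.00' amount is a stand-in for a real CGMoney value."
| ascii euro |
ascii := [ 1500 timesRepeat: [
	String streamContents: [ :s | s << '12.00' << ' ' << 'EUR' ] ] ] timeToRun.
euro := [ 1500 timesRepeat: [
	String streamContents: [ :s | s << '12.00' << ' ' << '€' ] ] ] timeToRun.
Transcript
	show: 'EUR suffix: ', ascii printString; cr;
	show: 'euro sign: ', euro printString; cr.
```

The absolute numbers depend on the image and VM; only the ratio between the two timings is of interest.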


The issue was discussed on the Seaside mailing list, where different
suggestions were raised, as well as benchmarks exposing the issue:
http://forum.world.st/TableReport-slow-down-tp4868925.html

I ran these benchmarks in Pharo 3 and 4, and the performance diagnostics
are the same.

There are existing reports that match to some extent:
https://pharo.fogbugz.com/f/cases/6639/Buffered-Text-Converters
https://pharo.fogbugz.com/f/cases/14279/EyeInspector-very-slow-on-WideString
https://pharo.fogbugz.com/f/cases/10868/Slow-UTF8-decoding

Nevertheless, the performance problem remains.

Thanks

Hilaire

--
Dr. Geo
http://drgeo.eu
http://google.com/+DrgeoEu




Re: Slow object printOn: with EURO symbol

Henrik Sperre Johansen

> On 02 Jan 2016, at 10:58 , Hilaire <[hidden email]> wrote:
> Indeed the '€' is encoded as WideString, so not sure what is happening.
> All content converted to WideString or what?

In the fallback code for WriteStream >> #nextPut:, at:put: is called, so yes, streaming a wide char causes the stream's collection to be converted from ByteString to WideString.
Conversion is done using become:, which currently triggers a full heap scan for references, and is thus very slow.
One could add a fast path along the lines of #pastEndPut: (which has already broken any assumption that a reference to the collection will reflect all writes for the lifetime of the stream, to avoid the same performance problems one would face using #become:): if the collection is a ByteString and anObject is a wide character, replace the collection with a WideString, and *then* call at:put:.
But it is not a very nice thing to add to a generic streaming class, nor is it very attractive at this point in time, considering that making become: a fast operation is one of the problems solved by Spur.
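
To make the idea concrete, such a fast path might look roughly like the sketch below. It is not the actual Pharo source and has not been tried in an image; it only assumes the position, writeLimit and collection instance variables of WriteStream, plus the existing #pastEndPut:, WideString class >> #from: and Character >> #isOctetCharacter.

```smalltalk
WriteStream >> nextPut: anObject
	"Hypothetical variant with a fast path for wide characters."
	position >= writeLimit
		ifTrue: [ ^ self pastEndPut: anObject ].
	"Swap in a WideString copy ourselves, so that ByteString >> at:put:
	 does not have to fall back to become:."
	(collection class == ByteString
		and: [ anObject isCharacter and: [ anObject isOctetCharacter not ] ])
			ifTrue: [ collection := WideString from: collection ].
	position := position + 1.
	^ collection at: position put: anObject
```

The #pastEndPut: branch would need the same check, and anyone still holding the original ByteString would no longer see later writes, which is the assumption already broken by #pastEndPut:.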

Cheers,
Henry


Re: Slow object printOn: with EURO symbol

HilaireFernandes
On 04/01/2016 11:05, Henrik Johansen wrote:
> But it is not a very nice thing to add to a generic streaming class, nor is it very attractive at this point in time, considering that making become: a fast operation is one of the problems solved by Spur.
So wait and see for Spur?
So that it is not forgotten, it is recorded here, and the case should be
kept open for a later check:
https://pharo.fogbugz.com/f/cases/17315/Slow-object-printOn-with-EURO-symbol

It is possible to work around this problem, but this sort of annoyance
with Pharo's internal encoding arises regularly, so I am not sure what to
think about the state of Pharo regarding internal encoding. Nowadays,
isn't everything supposed to be UTF-8?

Thanks

Hilaire


Re: Slow object printOn: with EURO symbol

Sven Van Caekenberghe-2
String with ByteString and WideString subclasses has been a standard feature of Squeak/Pharo for a long time. The transparent automatic conversion between the two is a feature, not a limitation.

In itself, there is nothing wrong with it.

Yes, other representations of Strings are possible, but it is far from certain that they would be faster overall. The current implementation favours Latin1 (and thus ASCII), because that is so common. In my work image I count them as follows:

ByteString allInstances size. "301498"

WideString allInstances size. "136"

That is less than 0.05%.


Re: Slow object printOn: with EURO symbol

HilaireFernandes
Agreed.
Nevertheless, the performance impact I got from doing things the OOP way
cannot be ignored: when you ask an object for its text representation
and it uses a non-Latin1 character, you pay a significant penalty. In
the long term, it looks like a problem for Pharo.

Hilaire



Re: Slow object printOn: with EURO symbol

Sven Van Caekenberghe-2

> On 05 Jan 2016, at 14:26, Hilaire <[hidden email]> wrote:
>
> Agreed.
> Nevertheless, the performance impact I got from doing things the OOP way
> cannot be ignored: when you ask an object for its text representation
> and it uses a non-Latin1 character, you pay a significant penalty. In
> the long term, it looks like a problem for Pharo.

No, that is not 100% correct.

You can use any Unicode character anywhere, transparently, and the performance penalty is low. Pharo fully supports Unicode everywhere (provided you use the right font).

The problem occurs only when you open a collection of 1000s of these objects in a tool that wants to convert them all to strings at once, but each one separately. Then the cumulative performance penalty becomes quite noticeable, true.

The problem can also be restated: is it really necessary for a tool to convert 1000s of items to strings, even if only 10s are shown on screen at the same time?

I believe that FastTable tries to do better here.


Re: Slow object printOn: with EURO symbol

HilaireFernandes
On 05/01/2016 14:38, Sven Van Caekenberghe wrote:
> The problem can also be restated: is it really necessary for a tool to convert 1000s of items to strings, even if only 10s are shown on screen at the same time?
>
> I believe that FastTable tries to do better here.
But I need to show 1000s of them at once, all in the same view, at the
same time.

Hilaire


Re: Slow object printOn: with EURO symbol

Henrik Sperre Johansen

> On 05 Jan 2016, at 3:20 , Hilaire <[hidden email]> wrote:
>
> On 05/01/2016 14:38, Sven Van Caekenberghe wrote:
>> The problem can also be restated: is it really necessary for a tool to convert 1000s of items to strings, even if only 10s are shown on screen at the same time?
>>
>> I believe that FastTable tries to do better here.
> But I need to show 1000s of them at once, all in the same view, at the
> same time.
I think Sven was referring to opening an inspector, where you usually only see 10-20 items in a list at once, even if the total number of items is much larger.
The new FastTable/List/Tree/WhateverItsCalled is much better suited for this: instead of creating the 1500 Strings that could potentially be shown before even opening the inspector, it creates the 20 or so that will actually be shown, then generates the rest as you scroll.
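
As a toy illustration of the difference (this is not FastTable's code; the fractions stand in for the real objects, and 20 is an assumed number of visible rows):

```smalltalk
| items eager visible |
items := (1 to: 1500) collect: [ :i | i / 7 ].
"Eager: print every item before anything is displayed."
eager := items collect: [ :each | each printString ].
"Lazy: print only the rows that fit on screen; the rest is printed on scroll."
visible := (items first: 20) collect: [ :each | each printString ].
```

With the lazy variant, a slow printOn: is paid about 20 times when the tool opens instead of 1500 times.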

Cheers,
Henry


Re: Slow object printOn: with EURO symbol

HilaireFernandes
On 05/01/2016 15:45, Henrik Johansen wrote:
> I think Sven was referring to opening an inspector, where you usually only see 10-20 items in a list at once, even if the total number of items is much larger.
> The new FastTable/List/Tree/WhateverItsCalled is much better suited for this: instead of creating the 1500 Strings that could potentially be shown before even opening the inspector, it creates the 20 or so that will actually be shown, then generates the rest as you scroll.
Got it

Thanks


Re: Slow object printOn: with EURO symbol

Ben Coman
In reply to this post by HilaireFernandes
On Tue, Jan 5, 2016 at 10:20 PM, Hilaire <[hidden email]> wrote:
> On 05/01/2016 14:38, Sven Van Caekenberghe wrote:
>> The problem can also be restated: is it really necessary for a tool to convert 1000s of items to strings, even if only 10s are shown on screen at the same time?
>>
>> I believe that FastTable tries to do better here.
> But I need to show 1000s of them at once, all in the same view, at the
> same time.

Is it that the same 1000 strings are being rendered multiple times a
second?  I had a similar problem with Roassal 1, where a Unicode char
was massively slowing down the UI.  I managed to achieve a fairly normal
response time by caching the FreeType-calculated width of the string.
I found the right place to do this by profiling both the whole system
and just the rendering, the latter by wrapping MessageTally around the
code of fullDrawOn: .

Then I cached both the string and its calculated length in the
rendering code, and if the cached string was the same as the current
one, I just returned the cached length.
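
From memory, the caching was along these lines. This is only a sketch and not the Roassal code: the class name CachedLabelWidth and the selector #widthOf:in: are made up for illustration, and it assumes the font answers #widthOfString:.

```smalltalk
"Hypothetical helper class holding the last measured string and its width."
Object subclass: #CachedLabelWidth
	instanceVariableNames: 'cachedString cachedWidth'
	classVariableNames: ''
	poolDictionaries: ''
	category: 'Sketch'

CachedLabelWidth >> widthOf: aString in: aFont
	"Measure only when the string differs from the last one seen."
	cachedString = aString
		ifFalse: [
			cachedWidth := aFont widthOfString: aString.
			cachedString := aString ].
	^ cachedWidth
```

Note that this cache ignores font changes, so it only works when the same label is always drawn with the same font.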

Sorry I can't be clearer.
cheers -ben


Re: Slow object printOn: with EURO symbol

HilaireFernandes
I don't think I want to hack on the system to get a normal response
time. I will use #render: when it gets too slow, even if it spoils the
beauty of the code.
This is the kind of problem that may lead a newbie to think Pharo is slow.

Hilaire
