On 05-09-2013, at 4:59 PM, Yoshiki Ohshima <[hidden email]> wrote:

> On Wed, Sep 4, 2013 at 8:24 PM, tim Rowledge <[hidden email]> wrote:
>> After simplifying the scanning code a bit I'm looking into why we have the seemingly insane situation of two parallel hierarchies of CharacterScanner. So far it looks like there are no really substantive differences between CharacterScanner and MultiCharacterScanner and their subclasses. This seems like a mistake somewhere; certainly it could be mine, missing something important.
>
> It's all my fault and incompetence. I am sorry.

Well, it might be your 'fault' but I rather doubt it was incompetence…

>> What is the intent of MultiXXXXX ? What is CombinedChar for? Are they, honestly, still needed? Or should the older versions be removed instead? Who wrote the new classes and is that person still maintaining them? Is he/she still around here?
>
> This kind of stuff touches the part of Squeak that *has to* work.
> Once the "MultiCharacterScanner" worked and people were confident, it
> was in theory possible to ditch the old implementation; but I did not
> think back then that it (replacing fundamental code with a
> "work-in-progress" version) was acceptable to the community. IF there
> was enough man-power, there would have been more variation of such
> scanners implemented for different writing systems; keeping the
> original version that works for byte strings would have been useful
> under that light.

So if I understand you correctly, there *should* be no particular differences in what the two types of scanner do? You made a parallel set in order to insulate your work from the tools that you needed to keep working in order to keep making the i18n stuff?

I've worked through several of the scanners without finding any major differences, but not yet all of them. It certainly looks to me that there is nothing to stop us having only one set. I suspect there may be some bug fixes in the more recently created classes, though I did notice at least a couple of places where the method in the old scanner class was actually newer than its equivalent in the new scanner. Do you recall any serious changes made to support multi-byte strings?

> CombinedChar creates a precomposed character from a sequence of
> decomposed form of Unicode when possible. For a certain keyboard, it
> was needed.

Ah, yes, now I see. Should CombinedChars ever exist outside that very narrow area of reading the keyboard and then copying out the results to the paragraphs? I didn't see any use beyond that but it can be hard to trace everything.

If it's actually possible to simplify and get rid of a duplication of classes it would be nice to clean up! Right now I'm thinking about refactoring to allow the class of the string and the font to be used instead of explicit tests for widestring and font-does-kerning etc. It seems to me that modern font systems are much more 'active' than we used to think of StrikeFonts being and maybe it is time fonts did their own scanning. That way it could be via simple methods, a prim or even a call out to a library. I'm aiming to make sure that the simple cases work really fast on slow machines (can we say Raspberry Pi?) and the complex cases at least work decently.

tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Strange OpCodes: RDR: Rotate Disk Right
On Thu, Sep 5, 2013 at 5:21 PM, tim Rowledge <[hidden email]> wrote:
> On 05-09-2013, at 4:59 PM, Yoshiki Ohshima <[hidden email]> wrote:
>
>>> What is the intent of MultiXXXXX ? What is CombinedChar for? Are they, honestly, still needed? Or should the older versions be removed instead? Who wrote the new classes and is that person still maintaining them? Is he/she still around here?
>>
>> This kind of stuff touches the part of Squeak that *has to* work.
>> Once the "MultiCharacterScanner" worked and people were confident, it
>> was in theory possible to ditch the old implementation; but I did not
>> think back then that it (replacing fundamental code with a
>> "work-in-progress" version) was acceptable to the community. IF there
>> was enough man-power, there would have been more variation of such
>> scanners implemented for different writing systems; keeping the
>> original version that works for byte strings would have been useful
>> under that light.
>
> So if I understand you correctly, there *should* be no particular differences in what the two types of scanner do? You made a parallel set in order to insulate your work from the tools that you needed to keep working in order to keep making the i18n stuff?

Not quite. The analogy for WideString and String was like LargeInteger and SmallInteger, and CharacterScanner was like a different implementation of #+. MultiCharacterScanner handles WideStrings, especially when characters with different leading chars are involved. So the functionality is different.

> I've worked through several of the scanners without finding any major differences, but not yet all of them. It certainly looks to me that there is nothing to stop us having only one set. I suspect there may be some bug fixes in the more recently created classes, though I did notice at least a couple of places where the method in the old scanner class was actually newer than its equivalent in the new scanner. Do you recall any serious changes made to support multi-byte strings?

The serious change was for handling the leading char, and also the different line breaking rules for different languages.

>> CombinedChar creates a precomposed character from a sequence of
>> decomposed form of Unicode when possible. For a certain keyboard, it
>> was needed.
>
> Ah, yes now I see . Should CombinedChars ever exist outside that very narrow area of reading the keyboard and then copying out the results to the paragraphs? I didn't see any use beyond that but it can be hard to trace everything.

Whenever you want to find out whether a sequence is composable, it is potentially useful.

-- Yoshiki
Hi Yoshiki,

I also note that there is a presentation and a presentationLine in MultiCharacterScanner; could you say a word about these inst. vars.?

2013/9/6 Yoshiki Ohshima <[hidden email]>
At Fri, 6 Sep 2013 23:02:57 +0200,
Nicolas Cellier wrote:
>
> Hi Yoshiki,
> I also note that there is a presentation and presentationLine in
> MultiCharacterScanner, could you tell a word about these inst. vars.?

That (IIRC) was also something to do with the mapping from the logical sequence of code points (that is what a Unicode string is) to the list of "characters" that can be used to fetch the glyphs. IOW, "presentation" is something created by looking at combinations in the logical sequence.

Again, we did not go too far; I think we supported simple accented characters but not much more.

-- Yoshiki
On 06-09-2013, at 3:26 PM, Yoshiki Ohshima <[hidden email]> wrote:

> At Fri, 6 Sep 2013 23:02:57 +0200,
> Nicolas Cellier wrote:
>>
>> Hi Yoshiki,
>> I also note that there is a presentation and presentationLine in
>> MultiCharacterScanner, could you tell a word about these inst. vars.?
>
> That (IIRC) was also something to do with the mapping from the logical
> sequence of code points (that is what a Unicode string is) to the list
> of "characters" that can be used to fetch the glyphs. IOW,
> "presentation" is something created by looking at combinations in the
> logical sequence.
>
> Again, we did not go too far; I think we supported a simple accented
> characters but not much more.

So far as I could tell from looking at senders and implementors, there wasn't really any use made of 'presentation' and not much of 'presentationLine'; certainly little enough that my *guess* would be they could go away without changing anything.

tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Useful random insult:- Suffers from permanent rapture of the deep. (Nitrogen narcosis.)
I should add that I'm keeping notes in mantis 1650 - http://bugs.squeak.org/view.php?id=1650 - for anyone interested in offering improvements or explanations or even promises of treats upon success.
tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
42.7 percent of all statistics are made up on the spot.
In reply to this post by timrowledge
At Fri, 6 Sep 2013 16:51:56 -0700,
tim Rowledge wrote:
>
> On 06-09-2013, at 3:26 PM, Yoshiki Ohshima <[hidden email]> wrote:
>
> > At Fri, 6 Sep 2013 23:02:57 +0200,
> > Nicolas Cellier wrote:
> >>
> >> Hi Yoshiki,
> >> I also note that there is a presentation and presentationLine in
> >> MultiCharacterScanner, could you tell a word about these inst. vars.?
> >
> > That (IIRC) was also something to do with the mapping from the logical
> > sequence of code points (that is what a Unicode string is) to the list
> > of "characters" that can be used to fetch the glyphs. IOW,
> > "presentation" is something created by looking at combinations in the
> > logical sequence.
> >
> > Again, we did not go too far; I think we supported a simple accented
> > characters but not much more.
>
> So far as I could tell from looking at senders and implementors,
> there wasn't really any use made of 'presentation' and not much of
> 'presentationLine'; certainly little enough that my *guess* would be
> they could go away without changing anything.

In the Etoys image, #addCharToPresentation: is called from #scanMultiCharactersCombiningFrom:...., and that is dispatched from Unicode class>>scanSelector. But I guess the mechanism was removed from the trunk at some point in the past.

-- Yoshiki
Yep, I did change it in the following commit; unfortunately I omitted to say exactly what the problem was...

Name: Multilingual-nice.116
Author: nice
Time: 27 March 2010, 11:22:00.573 pm
UUID: 6339699b-51ec-fb41-a1e0-c8246b621919
Ancestors: Multilingual-ul.115

Don't let Unicode use #scanMultiCharactersCombiningFrom:to:in:rightX:stopConditions:kern: until problems are fixed.
Anyway, combining diacritical was experimental and not really operational.

2013/9/9 Yoshiki Ohshima <[hidden email]>
> At Fri, 6 Sep 2013 16:51:56 -0700,
At Thu, 12 Sep 2013 00:17:31 +0200,
Nicolas Cellier wrote:
>
> Yep, I did change it in following commit, unfortunately I omitted to tell
> exactly which was the problem...

Ah ha. So, the fact that it is incomprehensible now is not solely on me^^;

-- Yoshiki
In reply to this post by timrowledge
On 4 September 2013 03:21, tim Rowledge <[hidden email]> wrote:
(sorry for being late on topic)

In Pharo, we certainly do. But the FreeType code also needs a decent review and cleanup, for sure. The FreeType package is an integral part of the Pharo base image, and is maintained there. As for its plugin, I even managed to fix a bug in it recently... in a primitive which nobody uses, though; mainly because I had plans to use it, but I haven't had time to run the experiment yet. (I still think, maybe naively, that FT rendering speed can be improved.)

Concerning MultiCharacterMeetPainfulDeathWhenStaringAtIt... it is just impossible to do something with it, especially considering that it is used for rendering text, and if you break/change it, you won't be able to fix it (because the image is using the very same code which you just broke to render all text).

At least in Pharo, we decided to not even try to do something about it (and as far as I know, nobody has tried to do anything for years); instead we decided to write things from scratch, and when it is ready, throw all this out without a bit of regret.

> Useful random insult:- Not much to show for four billion years of evolution.

--
Best regards,
Igor Stasenko.
On 12-09-2013, at 3:30 PM, Igor Stasenko <[hidden email]> wrote:

> On 4 September 2013 03:21, tim Rowledge <[hidden email]> wrote:
>
> > On 02-09-2013, at 12:45 PM, tim Rowledge <[hidden email]> wrote:
> >
> > > Who, if anyone, is maintaining the FreeType package? Who, if anyone, is using it? It has some rather old methods that nastily over-ride more recent methods in the trunk image. That implies it is moribund to me.
> >
> > Well it certainly sounds like nobody cares about FreeType. I guess that means nobody will mind as I rewrite some of the low-level font/scanner code and almost certainly break FreeType.
>
> (sorry for being late on topic)
>
> In Pharo, we certainly do. But FreeType code also needs a decent review and cleanup for sure.

Most code does; little code ever gets one.

> The freetype packages is an integral part of pharo base image, and maintained there.
> As for its plugin, i even managed to fix a bug recently in it.. in primitive which nobody uses though..
> mainly because i had plans to use it, but i haven't time to play an experiment, yet. (i am still thinking
> , maybe naively, that FT rendering speed can be improved).

Probably the smart thing is to use whatever is the best, most fastidiously maintained library with the widest platform spread. Go for something that makes good use of GPUs. If FreeType is that, it's worth the pain. Probably. Just make sure it runs on ARM based machines from the get-go or it will become irrelevant within a few years. Intel is being eaten alive.

> Concerning MultiCharacterMeetPainfulDeathWhenStaringAtIt ... it just impossible to do something with it,
> especially considering that it used for rendering text, and if you break/change it , you won't be able to fix it
> (because the image is using the very same code which you just broke to render all text.. ).
> At least in Pharo, we decided to not even try to do something about it (and as far as i know, nobody tries
> to do anything for years), instead we decided to write things from scratch,and when it will be ready, throw
> all this out without a bit of regret.

Oh you wimps! Where is the fun in that? If it doesn't involve quantum transforms via irrational phase-space dimensions while reciting Vogon Poetry in Klingon then it isn't difficult enough.

tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Strange OpCodes: BW: Branch on Whim
In reply to this post by Igor Stasenko
On Fri, 13 Sep 2013, Igor Stasenko wrote:
> On 4 September 2013 03:21, tim Rowledge <[hidden email]> wrote:
>
> > On 02-09-2013, at 12:45 PM, tim Rowledge <[hidden email]> wrote:
> >
> > > Who, if anyone, is maintaining the FreeType package? Who, if anyone, is using it? It has some rather old methods that nastily over-ride more recent methods in the trunk image. That implies it is moribund to me.
> >
> > Well it certainly sounds like nobody cares about FreeType. I guess that means nobody will mind as I rewrite some of the low-level font/scanner code and almost certainly break FreeType.
>
> (sorry for being late on topic)
>
> In Pharo, we certainly do. But FreeType code also needs a decent review and cleanup for sure.
> The freetype packages is an integral part of pharo base image, and maintained there.
> As for its plugin, i even managed to fix a bug recently in it.. in primitive which nobody uses though..
> mainly because i had plans to use it, but i haven't time to play an experiment, yet. (i am still thinking
> , maybe naively, that FT rendering speed can be improved).
>
> Concerning MultiCharacterMeetPainfulDeathWhenStaringAtIt ... it just impossible to do something with it,
> especially considering that it used for rendering text, and if you break/change it , you won't be able to fix it
> (because the image is using the very same code which you just broke to render all text.. ).

Why can't you just create a copy and modify that? That way you won't break your image.

Levente

> At least in Pharo, we decided to not even try to do something about it (and as far as i know, nobody tries
> to do anything for years), instead we decided to write things from scratch,and when it will be ready, throw
> all this out without a bit of regret.
>
> > tim
> > --
> > tim Rowledge; [hidden email]; http://www.rowledge.org/tim
> > Useful random insult:- Not much to show for four billion years of evolution.
>
> --
> Best regards,
> Igor Stasenko.
In reply to this post by timrowledge
On 4 September 2013 19:41, tim Rowledge <[hidden email]> wrote:
I would be happy to fix that (by adding you as a developer), but SqueakSource is living its last days and does not persist project changes (including added developers), which means that every time the SqS image reboots you would need to ask to be added again.

Today, we placed all relevant Pharo VM sources on GitHub (yes, including all .st sources in the form of FileTree). The FreeType plugin is one of them, and we're going to support it there. Needless to say, everyone is invited to contribute there.

https://github.com/pharo-project/pharo-vm

> I *might* conclude that the best way to make things better is to 'break' it in the sense of changing image code enough that the plugin would need changing to keep working - and if that happens and nobody chooses to make similar changes and support them in Pharo/Cuis, then I guess I'd have to fork the plugin code.

If I understood correctly, a font set is a group of fonts of the same family (let's say Arial) but in different variants (bold, italic, bold+italic), held in its fontArray ivar; and then there are places which use a 'font index' (see TextFontChange) to point to a concrete font in that set. Which is utterly ugly, I would say. And yes, there are no comments, since "smalltalk code is self-explanatory" :)

> To confuse things a bit more, there is a HostFont class that is rather oddly a subclass of StrikeFont and yet seems to load some platform format files (hard to say what since NO DAMN COMMENTS) which include potential kerning data - and then so far as I can see ignores all that and just makes a plain strike font. Oh - and nobody but win32 has the relevant FontPlugin anyway. Obvious question here is whether this is work to be kept and expanded or dropped in favour of FreeType or some other portable system. The code hasn't been updated in 9 years according to SVN, so I suspect it is dead.

You bet! :)

> Useful random insult:- Has a one-way ticket on the Disoriented Express.

--
Best regards,
Igor Stasenko.
In reply to this post by Levente Uzonyi-2
On 13 September 2013 00:44, Levente Uzonyi <[hidden email]> wrote:
That, I think, was exactly the reason why we have CharacterScanner and MultiCharacterScanner in our images today. So, rephrasing: "it has been tried once, and it doesn't seem to help" :)

--
Best regards,
Igor Stasenko.
In reply to this post by timrowledge
On 13 September 2013 00:43, tim Rowledge <[hidden email]> wrote:
Amen.
AFAIK, Esteban managed to build the FT2 library on iOS and can use it in iOS VMs, so, from that perspective, I think it is a safe choice.

As for GPUs & stuff: even if you don't use the FT2 library for rendering, but just for reading the font data and extracting all the necessary info (like char/glyph mappings, kerning and glyph metrics + outlines), that functionality alone is a good enough reason to keep using it.

Well, if there were any hope that it could be shaped into something more or less ugly (instead of horrifying), I would be all for it. The main issue is that the code makes a lot of assumptions about fonts and about laying out glyphs, in many places. Things are heavily optimized to render the first 256 characters with the old primitive (which doesn't work, btw). FreeType fonts can't use that primitive, only raster fonts can. There are other assumptions about how fonts are rendered and by what. A lot of these concerns are mixed in a single place, so I do not see a solution there.

We decided to go a more radical way in Pharo: we are building a new text model (a replacement for the Text class) and a new layout and rendering engine for it. (Oh, and if you think that is less difficult than dealing with the old code, I'm afraid I must disappoint you ;). The text domain is inherently complex.

> Strange OpCodes: BW: Branch on Whim

--
Best regards,
Igor Stasenko.
In reply to this post by Igor Stasenko
On Fri, 13 Sep 2013, Igor Stasenko wrote:
> On 13 September 2013 00:44, Levente Uzonyi <[hidden email]> wrote:
> > On Fri, 13 Sep 2013, Igor Stasenko wrote:
> > > On 4 September 2013 03:21, tim Rowledge <[hidden email]> wrote:
> > > > On 02-09-2013, at 12:45 PM, tim Rowledge <[hidden email]> wrote:
> > > > > Who, if anyone, is maintaining the FreeType package? Who, if anyone, is using it? It has some rather old methods that nastily over-ride more recent methods in the trunk image. That implies it is moribund to me.
> > > >
> > > > Well it certainly sounds like nobody cares about FreeType. I guess that means nobody will mind as I rewrite some of the low-level font/scanner code and almost certainly break FreeType.
> > >
> > > (sorry for being late on topic)
> > >
> > > In Pharo, we certainly do. But FreeType code also needs a decent review and cleanup for sure.
> > > The freetype packages is an integral part of pharo base image, and maintained there.
> > > As for its plugin, i even managed to fix a bug recently in it.. in primitive which nobody uses though..
> > > mainly because i had plans to use it, but i haven't time to play an experiment, yet. (i am still thinking
> > > , maybe naively, that FT rendering speed can be improved).
> > >
> > > Concerning MultiCharacterMeetPainfulDeathWhenStaringAtIt ... it just impossible to do something with it,
> > > especially considering that it used for rendering text, and if you break/change it , you won't be able to fix it
> > > (because the image is using the very same code which you just broke to render all text.. ).
> >
> > Why can't you just create a copy and modify that? That way you won't break your image.
>
> that, i think was exactly the reason why we having CharacterScanner and MultiCharacterScanner in our images
> today..
> so, rephrasing: "it has been tried once, and it doesn't seems to help" :)

Levente
OK, after a couple of weeks rewriting how Scratch draws tiles I'm back to looking at dear old scanners and fonts.
After side-by-side comparing the old scanners with the new 'Multi' scanners, my conclusion is that there is *very* little difference and we really ought to be able to go back to a single set of classes. Which, I claim, would be nice, since we've already visibly suffered from the obvious side-effect of having two trees as bug fixes get added to only one part.

So far as I could tell the only substantive difference relates to the use of the presentation/presentationLine ivars, which seems to be not very important (ref Yoshiki's message 8 sept) and even seems to be mostly inactive (ref nice's message regarding Multilingual-nice.116, same date). It would be really nice to get a solid decision on whether it is still wanted, or should be removed, or if it needs some fixes that can be provided.

I'm puzzled by quite a few things I've discovered.

a) There are {language}environment classes and encoding classes. There is #isBreakableAt:in: implemented in both but seemingly unused in the encoding classes because it is just plain broken there. Should it be removed from the encoders? In the language environment classes it is implemented to return true for space and cr by default, but space, cr & lf in Latin1 and Latin2. Is that as expected?

b) MultiCharacterScanner>setConditionArray: cuts out the handling of #space - any ideas why?

c) As previously mentioned, MultiCharacterScanner>addCharToPresentation: is currently unused, apparently because of issues with Unicode & #scanMultiCharactersCombiningFrom:to:in:rightX:stopConditions:kern: - do we have any decent hope of a fix?

d) MultiCanvasCharacterScanner>setFont uses baselineY differently to its CanvasCharacterScanner equivalent; why?

e) TextComposer>composeLinesFrom:to:delta:into:…. differs minimally from MultiTextComposer>multiComposeLinesFrom:to:delta:into:…. and is the only sender of #canComputeDefaultLineHeight. What is the intent? Is this just a bug fix added in one place and not the other?

f) One of the oddest - DisplayScanner>displayLine:offset:leftInRun: passes the displayString:.. to the font (which I see as good) but MultiDisplayScanner uses the bitblt instead, which seems quite wrong.

That'll do for now. I hope we can get some information together to allow this to be improved since it should really simplify an important area of code that gets a lot of exercise and needs to be as fast as possible. I know most of you are running 128 core 75GHz machines with 42Tb of ram and so on, so it hardly matters, but there are over a million Pi's trying to run Scratch that need help. And there will almost certainly be *many* millions more trying to use Scratch and Squeak over the next year or so, scattered across parts of the world where a Pi is more computer than anyone could have imagined a year ago - Pi is probably going to be *the* platform in sub-Saharan Africa and much of Asia. Let's at least try to make it good, eh? This is what Smalltalk has always claimed to be about.

tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
Strange OpCodes: AGO: Allow Games Only
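For anyone who wants to poke at the line-breaking difference described in point a), here is a rough workspace sketch. The two blocks are made-up stand-ins, not the trunk source of #isBreakableAt:in:; they only imitate the behaviour described above (space and cr by default, plus lf for Latin1 and Latin2).

```smalltalk
"Workspace sketch only. defaultBreakable and latinBreakable are hypothetical names;
 they imitate the behaviour described for #isBreakableAt:in:, not its actual source."
| defaultBreakable latinBreakable sample |

"Default rule as described: break at space or cr."
defaultBreakable := [:text :index | | c |
	c := text at: index.
	c = Character space or: [c = Character cr]].

"Latin1/Latin2 rule as described: the default rule, plus lf."
latinBreakable := [:text :index |
	(defaultBreakable value: text value: index)
		or: [(text at: index) = Character lf]].

"Index 4 is the space, index 8 is the lf."
sample := 'one two' , (String with: Character lf) , 'three'.

Transcript
	show: (defaultBreakable value: sample value: 4) printString; cr;
	show: (defaultBreakable value: sample value: 8) printString; cr;
	show: (latinBreakable value: sample value: 8) printString; cr.
```

Evaluated in a workspace, the second and third lines show where the two rules disagree: the same lf is a break point under one rule and not under the other.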
I really like the effort you put into cleaning up the mess. I have briefly looked at this code in the past and it is quite convoluted and hard to understand what and where settings get set.
I tried to hunt down an issue in the Etoys image where a bug can cause certain characters to change color when the image gets into a shaky state (which happens quite often when I test ideas).

Karl

On Sat, Sep 21, 2013 at 5:08 AM, tim Rowledge <[hidden email]> wrote:
> OK, after a couple of weeks rewriting how Scratch draws tiles I'm back to looking at dear old scanners and fonts.
In reply to this post by timrowledge
I'm going to be very brave and commit my changes so far; nothing has broken on my system since making the changes. Obviously there is some chance it will be necessary to rewind but I consider it low.
tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
A)bort, R)etry or S)elf-destruct?
On 23-09-2013, at 12:40 PM, tim Rowledge <[hidden email]> wrote:

> I'm going to be very brave and commit my changes so far; nothing has broken on my system since making the changes. Obviously there is some chance it will be necessary to rewind but I consider it low.

Both packages (Graphics-tpr.226 & Multilingual-tpr.170) committed ok.

REALLY IMPORTANT NOTE - the FreeType package needs two changes that I cannot make, lacking permissions to hit that file.

1) AbstractFont>widthAndKernedWidthOfLeft: leftCharacter right: rightCharacterOrNil into: aTwoElementArray
	"Set the first element of aTwoElementArray to the width of leftCharacter and
	the second element to the width of left character when kerned with
	rightCharacterOrNil. Answer aTwoElementArray"
	"Actually, nearly all users of this actually want just the widthOf the leftCharacter, so we will default to that for speed. See other implementations for more complex cases - and note that this may be a temporary fix until scanners are improved"
	| w |
	w := self widthOf: leftCharacter.
	aTwoElementArray at: 1 put: w.
	aTwoElementArray at: 2 put: w.
	^aTwoElementArray

	" The old code, and what fonts which have pair-kerning would use -
	w := self widthOf: leftCharacter.
	rightCharacterOrNil isNil
		ifTrue:[
			aTwoElementArray
				at: 1 put: w;
				at: 2 put: w]
		ifFalse:[
			k := self kerningLeft: leftCharacter right: rightCharacterOrNil.
			aTwoElementArray
				at: 1 put: w;
				at: 2 put: w+k].
	^aTwoElementArray
	"

2) FreeTypeFont (or whichever is the right name) >isPairKerningCapable
	"a hopefully temporary test method; better factoring of scan/measure/display should remove the need for it.
	Only FreeType fonts would currently add this to return true"
	^true

tim
--
tim Rowledge; [hidden email]; http://www.rowledge.org/tim
I'm so skeptical that I'm not sure I'm really a skeptic
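A quick way to see what the first method fills in is to call it from a workspace. This is only a sketch: it assumes an image in which the font in use implements #widthAndKernedWidthOfLeft:right:into: as quoted above, and it guards the call to #isPairKerningCapable, since that is the proposed test method and may not be present everywhere.

```smalltalk
"Workspace sketch. Assumes the font answers #widthAndKernedWidthOfLeft:right:into:
 with the behaviour described in the message above."
| font widths |
font := TextStyle defaultFont.
widths := Array new: 2.
font widthAndKernedWidthOfLeft: $A right: $V into: widths.

"With the simplified default both slots should hold the plain width of $A;
 a pair-kerning font may instead put a kerned width in the second slot."
Transcript
	show: 'plain width: ' , (widths at: 1) printString; cr;
	show: 'kerned width: ' , (widths at: 2) printString; cr.

"#isPairKerningCapable is the proposed test method, so guard against images without it."
(font respondsTo: #isPairKerningCapable)
	ifTrue: [Transcript show: 'pair kerning: ' , font isPairKerningCapable printString; cr]
```

The sketch reads the results from the array it passed in rather than from the message's return value, so it does not depend on what the method answers.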