Hello guys.

I want to visualize DNA sequence alignments in Pharo 8. For this task most bioinformatics applications set a background color for each letter. But in Pharo the Inspector is too slow to open, even for one small sequence of 1 kb. Consider that there are now about 37k COVID-19 sequences and each genome contains about 30k letters, so visualizing and scrolling them (and zooming) needs to be fast.

Have a look at this script, which takes about 6 seconds to open an Inspector. The script uses BioSmalltalk, and the code could surely be improved, but that is not relevant to my visualization performance problem:

    [ | text attributes |
      " Generate a Text object from a random sequence "
      text := ((BioSequence forAlphabet: BioDNAAlphabet) randomLength: 1000) sequence asText.
      " Setup an array for each nucleotide background color "
      attributes := Array new: text size.
      1 to: text size do: [ : index |
        attributes at: index put: { (TextBackgroundColor color: (BioDNAAlphabet colorMap at: (text at: index))) } ].
      text runs: (RunArray newFrom: attributes).
      text inspect ] timeToRun asString "'0:00:00:05.911'"

Also, resizing the opened Inspector takes 2-3 seconds to refresh. You can see the output here: https://imgur.com/a/xUlBeVY

I should say that without the #inspect the code runs without performance issues:

    "'0:00:00:00.009'"

So I ran the script again for different sequence sizes:

    String streamContents: [ : stream |
      100 to: 2000 by: 100 do: [ : sl |
        stream nextPutAll: ([ | text attributes |
          " Generate a Text object from a random sequence "
          text := ((BioSequence forAlphabet: BioDNAAlphabet) randomLength: sl) sequence asText.
          " Setup an array for each nucleotide background color "
          attributes := Array new: text size.
          1 to: text size do: [ : index |
            attributes at: index put: { (TextBackgroundColor color: (BioDNAAlphabet colorMap at: (text at: index))) } ].
          text runs: (RunArray newFrom: attributes).
          text inspect ] timeToRun asString); cr ] ]

And these are the results, one line per sequence length from 100 to 2000 in steps of 100:

    Length  Time
     100    0:00:00:00.147
     200    0:00:00:00.28
     300    0:00:00:00.568
     400    0:00:00:00.993
     500    0:00:00:01.776
     600    0:00:00:02.123
     700    0:00:00:03.111
     800    0:00:00:04.084
     900    0:00:00:04.574
    1000    0:00:00:06.192
    1100    0:00:00:07.214
    1200    0:00:00:07.915
    1300    0:00:00:10.382
    1400    0:00:00:12.725
    1500    0:00:00:12.359
    1600    0:00:00:17.357
    1700    0:00:00:17.147
    1800    0:00:00:20.651
    1900    0:00:00:20.392
    2000    0:00:00:23.238

At first I thought it was a problem with the Glamour text renderer for Rubric Text, but profiling a single pass of the snippet for 2000 letters shows a couple of methods in the Rubric scanner, reached after some DNU sends, which consume a lot of the time:

    RubCharacterBlockScanner(RubCharacterBlockScanner) >> characterBlockAtPoint:index:in:
    RubCharacterBlockScanner(RubCharacterBlockScanner) >> endOfRun

I attached the full profiler report so you can have a look if you like. The summary is:

**Leaves**
    37.4% {8800ms} RubCompositionScanner(RubCharacterScanner)>>basicScanCharactersFrom:to:in:rightX:stopConditions:kern:
    6.3% {1476ms} Dictionary>>at:ifAbsentPut:
    6.1% {1425ms} Context>>unwindComplete
    4.6% {1082ms} Semaphore>>criticalReleasingOnError:
    4.2% {991ms} Dictionary>>at:ifAbsent:
    3.3% {785ms} Context>>aboutToReturn:through:
    2.2% {527ms} Context>>resume:through:
    2.0% {470ms} ExternalAddress>>isNull
    1.8% {421ms} BlockClosure>>on:do:
    1.7% {402ms} RubCharacterBlockScanner(RubCharacterScanner)>>setConditionArray:
    1.6% {378ms} FreeTypeFace>>validate
    1.6% {376ms} Dictionary>>scanFor:
    1.5% {364ms} Context>>unwindComplete:
    1.5% {344ms} Context>>unwindBlock
    1.4% {323ms} Array(SequenceableCollection)>>do:
    1.3% {299ms} Dictionary(HashedCollection)>>findElementOrNil:
    1.2% {293ms} RunArray>>at:setRunOffsetAndValue:
    1.2% {289ms} FreeTypeCache>>atFont:charCode:type:ifAbsentPut:
    1.1% {252ms} FreeTypeCacheLinkedList>>moveDown:

**Memory**
    old    +0 bytes
    young  -1,485,272 bytes
    used   -1,485,272 bytes
    free   +1,485,272 bytes

**GCs**
    full        0 totalling 0ms (0.0% uptime)
    incr        947 totalling 1,576ms (7.0% uptime), avg 2.0ms
    tenures     0
    root table  0 overflows

So my question is: are there any other text rendering backends to try? By backends I mean ones that don't use Rubric.

Cheers,

Hernán

Attachment: Profile_DNABgColoring.txt (38K)
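
As a rough check on how the timings above scale with sequence length, the following Playground sketch fits a straight line to ln(time) against ln(length) by ordinary least squares. The numbers are copied from the results list; the variable names are only illustrative. A slope near 1 would suggest linear growth and a slope near 2 quadratic growth. This only describes the measurements and does not by itself identify the cause.

```smalltalk
| sizes times xs ys n meanX meanY sxy sxx slope |
"Sequence lengths used by the benchmark loop (100 to 2000 by 100)."
sizes := (100 to: 2000 by: 100) asArray.
"Wall-clock seconds copied from the results posted above."
times := #(0.147 0.28 0.568 0.993 1.776 2.123 3.111 4.084 4.574 6.192
           7.214 7.915 10.382 12.725 12.359 17.357 17.147 20.651 20.392 23.238).
"Work in log-log space so that a power law becomes a straight line."
xs := sizes collect: [ :each | each ln ].
ys := times collect: [ :each | each ln ].
n := xs size.
meanX := (xs inject: 0 into: [ :sum :each | sum + each ]) / n.
meanY := (ys inject: 0 into: [ :sum :each | sum + each ]) / n.
sxy := 0.
sxx := 0.
1 to: n do: [ :i |
    sxy := sxy + (((xs at: i) - meanX) * ((ys at: i) - meanY)).
    sxx := sxx + ((xs at: i) - meanX) squared ].
"Least-squares slope, i.e. the estimated exponent k in time ~ length^k."
slope := sxy / sxx.
slope
```

Even without the fit, the posted numbers show that doubling the length from 1000 to 2000 letters raises the time from 6.192 s to 23.238 s, a factor of about 3.75, which is well above what linear scaling would give.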
On Sat, Feb 22, 2020 at 5:22, Stéphane Ducasse (<[hidden email]>) wrote: