Christoph Thiede uploaded a new version of HelpSystem-Core to project The Inbox:
http://source.squeak.org/inbox/HelpSystem-Core-ct.129.mcz ==================== Summary ==================== Name: HelpSystem-Core-ct.129 Author: ct Time: 2 March 2020, 10:32:56.341949 am UUID: 65f1da58-ae12-c14a-a12c-bc9a3b7a08b3 Ancestors: HelpSystem-Core-mt.119 Improves parsing of html help topics - Detect relative links and convert them to absolute version - Add support for a cleanseBlock that will be applied to the html body source - Trim leading and trailing blanks from the text - Cache contents Small refactoring: - Remove unnecessary duplicate parse logic from #subtopics - Again in #subtopics, don't pass the result of [self fooBlock] but the instance variable fooBlock instead. Don't manifest default values ... =============== Diff against HelpSystem-Core-mt.119 =============== Item was changed: AbstractHelpTopic subclass: #HtmlHelpTopic + instanceVariableNames: 'url level selectBlock convertBlock cleanseBlock document contents subtopicUrls subtopics' - instanceVariableNames: 'url document selectBlock convertBlock subtopicUrls subtopics level' classVariableNames: '' poolDictionaries: '' category: 'HelpSystem-Core-Model'! Item was added: + ----- Method: HtmlHelpTopic>>cleanseBlock (in category 'accessing') ----- + cleanseBlock + "Answer the block that will be applied to the HTML body source in order to filter relevant information." + + ^ cleanseBlock ifNil: [ [:contents | contents] ]! Item was added: + ----- Method: HtmlHelpTopic>>cleanseBlock: (in category 'accessing') ----- + cleanseBlock: aBlock + "Indicate the block that will be applied to the HTML body source in order to filter relevant information." + + cleanseBlock := aBlock.! Item was changed: ----- Method: HtmlHelpTopic>>contents (in category 'accessing') ----- contents + | start end source text rootUrl | + contents ifNotNil: [^ contents]. + + start := self document findString: '<body'. - | start end | - start := (self document findString: '<body'). start := (self document findString: '>' startingAt: start) + 1. end := self document findString: '</body>' startingAt: start. start > end ifTrue: [^ self document]. + source := self document copyFrom: start to: end - 1. + source := self cleanseBlock value: source. + text := (source copyReplaceAll: String cr with: '<br>') + asTextFromHtml. + + "Convert relative URLs (https://www.w3.org/TR/WD-html40-970917/htmlweb.html#h-5.1.2)" + rootUrl := url readStream in: [:urlStream | + | host scheme | + scheme := urlStream upToAll: '://'. + host := urlStream upTo: $/. + scheme , '://' , host]. + (text runs gather: #yourself) withoutDuplicates + select: [:attribute | attribute isKindOf: TextURL] + thenDo: [:attribute | + (attribute info beginsWith: '..') + ifTrue: [attribute url: self url , (attribute info skip: 2)]. + (attribute info beginsWith: '/') + ifTrue: [attribute url: rootUrl , attribute info]]. + + ^ contents := text withBlanksTrimmed! - ^ ((self document copyFrom: start to: end - 1) - copyReplaceAll: String cr with: '<br>') - asTextFromHtml! Item was changed: ----- Method: HtmlHelpTopic>>subtopics (in category 'accessing') ----- subtopics - | start end urls | subtopics ifNotNil: [^ subtopics]. - urls := OrderedCollection new. - - start := self document findString: '<a '. - [start > 0] whileTrue: [ - start := self document findString: 'href' startingAt: start. - start := (self document findString: '"' startingAt: start) + 1. - end := self document findString: '"' startingAt: start. - urls addIfNotPresent: (self document copyFrom: start to: end - 1). - start := self document findString: '<a ' startingAt: start.]. - subtopics := (self subtopicUrls collect: [:aUrl | self class new level: self level + 1; url: aUrl; + selectBlock: selectBlock; + convertBlock: convertBlock; + cleanseBlock: cleanseBlock]). - selectBlock: self selectBlock; - convertBlock: self convertBlock]). Project current uiProcess == Processor activeProcess ifTrue: [self fetchSubtopics]. ^ subtopics! |
Free forum by Nabble | Edit this page |