The Inbox: HelpSystem-Core-ct.129.mcz

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
1 message Options
Reply | Threaded
Open this post in threaded view
|

The Inbox: HelpSystem-Core-ct.129.mcz

commits-2
Christoph Thiede uploaded a new version of HelpSystem-Core to project The Inbox:
http://source.squeak.org/inbox/HelpSystem-Core-ct.129.mcz

==================== Summary ====================

Name: HelpSystem-Core-ct.129
Author: ct
Time: 2 March 2020, 10:32:56.341949 am
UUID: 65f1da58-ae12-c14a-a12c-bc9a3b7a08b3
Ancestors: HelpSystem-Core-mt.119

Improves parsing of html help topics

- Detect relative links and convert them to absolute version
- Add support for a cleanseBlock that will be applied to the html body source
- Trim leading and trailing blanks from the text
- Cache contents

Small refactoring:

- Remove unnecessary duplicate parse logic from #subtopics
- Again in #subtopics, don't pass the result of [self fooBlock] but the instance variable fooBlock instead. Don't manifest default values ...

=============== Diff against HelpSystem-Core-mt.119 ===============

Item was changed:
  AbstractHelpTopic subclass: #HtmlHelpTopic
+ instanceVariableNames: 'url level selectBlock convertBlock cleanseBlock document contents subtopicUrls subtopics'
- instanceVariableNames: 'url document selectBlock convertBlock subtopicUrls subtopics level'
  classVariableNames: ''
  poolDictionaries: ''
  category: 'HelpSystem-Core-Model'!

Item was added:
+ ----- Method: HtmlHelpTopic>>cleanseBlock (in category 'accessing') -----
+ cleanseBlock
+ "Answer the block that will be applied to the HTML body source in order to filter relevant information."
+
+ ^ cleanseBlock ifNil: [ [:contents | contents] ]!

Item was added:
+ ----- Method: HtmlHelpTopic>>cleanseBlock: (in category 'accessing') -----
+ cleanseBlock: aBlock
+ "Indicate the block that will be applied to the HTML body source in order to filter relevant information."
+
+ cleanseBlock := aBlock.!

Item was changed:
  ----- Method: HtmlHelpTopic>>contents (in category 'accessing') -----
  contents
 
+ | start end source text rootUrl |
+ contents ifNotNil: [^ contents].
+
+ start := self document findString: '<body'.
- | start end |
- start := (self document findString: '<body').
  start := (self document findString: '>' startingAt: start) + 1.
  end := self document findString: '</body>' startingAt: start.
 
  start > end ifTrue: [^ self document].
 
+ source := self document copyFrom: start to: end - 1.
+ source := self cleanseBlock value: source.
+ text := (source copyReplaceAll: String cr with: '<br>')
+ asTextFromHtml.
+
+ "Convert relative URLs (https://www.w3.org/TR/WD-html40-970917/htmlweb.html#h-5.1.2)"
+ rootUrl := url readStream in: [:urlStream |
+ | host scheme |
+ scheme := urlStream upToAll: '://'.
+ host := urlStream upTo: $/.
+ scheme , '://' , host].
+ (text runs gather: #yourself) withoutDuplicates
+ select: [:attribute | attribute isKindOf: TextURL]
+ thenDo: [:attribute |
+ (attribute info beginsWith: '..')
+ ifTrue: [attribute url: self url , (attribute info skip: 2)].
+ (attribute info beginsWith: '/')
+ ifTrue: [attribute url: rootUrl , attribute info]].
+
+ ^ contents := text withBlanksTrimmed!
- ^ ((self document copyFrom: start to: end - 1)
- copyReplaceAll: String cr with: '<br>')
- asTextFromHtml!

Item was changed:
  ----- Method: HtmlHelpTopic>>subtopics (in category 'accessing') -----
  subtopics
 
- | start end urls |
  subtopics ifNotNil: [^ subtopics].
 
- urls := OrderedCollection new.
-
- start := self document findString: '<a '.
- [start > 0] whileTrue: [
- start := self document findString: 'href' startingAt: start.
- start := (self document findString: '"' startingAt: start) + 1.
- end := self document findString: '"' startingAt: start.
- urls addIfNotPresent: (self document copyFrom: start to: end - 1).
- start := self document findString: '<a ' startingAt: start.].
-
  subtopics := (self subtopicUrls collect: [:aUrl | self class new
  level: self level + 1;
  url: aUrl;
+ selectBlock: selectBlock;
+ convertBlock: convertBlock;
+ cleanseBlock: cleanseBlock]).
- selectBlock: self selectBlock;
- convertBlock: self convertBlock]).
 
  Project current uiProcess == Processor activeProcess
  ifTrue: [self fetchSubtopics].
 
  ^ subtopics!