Hi guys,
I wonder which is the easiest way to do the following. I have a string which could contain something like 'this is a string with <code>some funny lines</code> and here is another <code>haha</code>'. I need to parse that string, find all places where I have things surrounded with <code>SOMETHING</code>, get the "SOMETHING" (in the previous example, that would be 'some funny lines'), execute that (this is something internal), and from that I get the real string (imagine in this case the answer is 'SOME FUNNY LINES'). Finally, I need to replace it in the original string.

So, given the input:

    'this is a string with <code>some funny lines</code> and here is another <code>haha</code>'

and given my specific domain logic transformation (in this example I assume a simple #asUppercase), I would like to get:

    'this is a string with SOME FUNNY LINES and here is another HAHA'

I got it working with the lines below, but it is a hack and terribly slow (I imagine). So, does anyone have an idea how I can do this simpler/faster? Maybe some RB rewrite rule?

Thanks in advance

    | dom string originalString stringToBeAbleToParse xmlDocument replacements finalString |
    replacements := Dictionary new.
    originalString := 'this is a string with <code>some funny lines</code> and here is another <code>haha</code>'.
    stringToBeAbleToParse := '<hack>', originalString, '</hack>'.
    dom := XMLDOMParser on: stringToBeAbleToParse.
    dom configuration isValidating: false.
    xmlDocument := dom parseDocument.
    (xmlDocument allElementsNamed: 'code') do: [ :aXMLElement |
        "Let's simulate my domain transformation logic as a simple #asUppercase"
        replacements
            at: aXMLElement asString
            put: ([ :code | code asUppercase ] value: aXMLElement nodes first asString) ].
    finalString := originalString.
    replacements keysAndValuesDo: [ :originalText :new |
        finalString := finalString copyReplaceAll: originalText with: new ].
    finalString
Hi Mariano,
On Thu, May 12, 2016 at 04:50:29PM -0300, Mariano Martinez Peck wrote:
> I got it working with the lines below, but it is a hack and terribly slow (I imagine).
> So, does anyone have an idea how I can do this simpler/faster? Maybe some RB rewrite rule?

I don't think this is quite what you want, but it should be close enough to get you started:

    | str re |
    str := 'this is a string with <code>some funny lines</code> and here is another <code>haha</code>'.
    re := '<code>([^<]*)</code>' asRegex.
    re copy: str translatingMatchesUsing: [ :each | each asUppercase ].

HTH,
Alistair
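As far as I can tell, the block given to #copy:translatingMatchesUsing: receives the whole matched substring, tags included, so the snippet above would uppercase the <code> tags and keep them in the result. A minimal sketch of one way around that, assuming the tags are always exactly '<code>' and '</code>' and the content never contains a '<' character, is to cut the tags off inside the block before applying the transformation:

```smalltalk
| str re |
str := 'this is a string with <code>some funny lines</code> and here is another <code>haha</code>'.
re := '<code>[^<]*</code>' asRegex.
"Assumption: each match starts with the 6-character '<code>' and ends with
 the 7-character '</code>', so the content sits between index 7 and size - 7."
re copy: str translatingMatchesUsing: [ :each |
    (each copyFrom: 7 to: each size - 7) asUppercase ]
```

The #asUppercase send stands in for the real domain transformation; any block that answers a String should fit in the same place.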
Hi.

I always solve this kind of problem with streams. It is super easy, and much easier than regex (I hate regex). For your case it would be something like:

    in := source readStream.
    result := String streamContents: [:out |
        [in atEnd] whileFalse: [
            out nextPutAll: (in upToAll: '<code>').
            code := in upToAll: '</code>'.
            out nextPutAll: code asUppercase; nextPutAll: '</code>'].
    ]

And it could be much nicer with Xtreams, but I don't remember its API (maybe tomorrow I will remember).
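For reference, here is a slightly adjusted sketch of the stream approach. The snippet as posted in the thread writes '</code>' back to the output after each transformed chunk, which the expected result in the original post does not contain. This variant drops that and only transforms when an opening tag was actually found. It assumes well-formed, non-nested tags; the variable names are only illustrative.

```smalltalk
| source in result |
source := 'this is a string with <code>some funny lines</code> and here is another <code>haha</code>'.
in := source readStream.
result := String streamContents: [ :out |
    [ in atEnd ] whileFalse: [
        "Copy plain text up to the next opening tag (or the rest of the string)."
        out nextPutAll: (in upToAll: '<code>').
        "If anything is left, an opening tag was consumed: transform its content."
        in atEnd ifFalse: [
            out nextPutAll: (in upToAll: '</code>') asUppercase ] ] ].
result
```

As before, #asUppercase is a placeholder for the internal transformation. Since the input is read once and the output written once, this should avoid the repeated #copyReplaceAll:with: passes of the XML-based version.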
And here it is with PetitParser using the islands support. I think this is the nicest to read:
    codeParser := ('<code>' asParser , '</code>' asParser negate star flatten , '</code>' asParser)
        ==> [ :t | t second asUppercase ].
    parser := codeParser island star ==> [ :t | '' join: t flatten ].
    originalString := 'this is a string with <code>some funny lines</code> and here is another <code>haha</code>'.
    parser parse: originalString
    "--> this is a string with SOME FUNNY LINES and here is another HAHA"

Cheers,
Doru

> On May 12, 2016, at 10:35 PM, Denis Kudriashov <[hidden email]> wrote:
>
> Hi.
>
> I always solve this kind of problem with streams. It is super easy, and much easier than regex (I hate regex). For your case it would be something like:
>
> in := source readStream.
> result := String streamContents: [:out |
>     [in atEnd] whileFalse: [
>         out nextPutAll: (in upToAll: '<code>').
>         code := in upToAll: '</code>'.
>         out nextPutAll: code asUppercase; nextPutAll: '</code>'].
> ]
>
> And it could be much nicer with Xtreams, but I don't remember its API (maybe tomorrow I will remember).

--
www.tudorgirba.com
www.feenk.com

"From an abstract enough point of view, any two things are similar."