http://ss3.gemtalksystems.com/ss/Tabular.html contains an application example of a SAX parser. With SAX, you pick out only
what is of interest.
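
To illustrate the event-driven idea, here is a minimal sketch in Python rather than Pharo. It is not the XMLParser API, and the names ScriptCollector and extract_scripts are made up for this example. It uses the standard html.parser module, which reports start tags, text and end tags as events and does not require well-formed input, so only the <script> events are kept:

```python
from html.parser import HTMLParser


class ScriptCollector(HTMLParser):
    """Keeps the attributes and text of every <script> element (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.scripts = []    # collected (attributes dict, source text) pairs
        self._attrs = None   # attributes of the <script> being read, else None
        self._chunks = []    # text pieces seen inside the current <script>

    def handle_starttag(self, tag, attrs):
        # Tag names arrive lower-cased; every other tag is simply ignored.
        if tag == "script":
            self._attrs = dict(attrs)
            self._chunks = []

    def handle_data(self, data):
        # Text is recorded only while we are inside a <script> element.
        if self._attrs is not None:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._attrs is not None:
            self.scripts.append((self._attrs, "".join(self._chunks)))
            self._attrs = None


def extract_scripts(html):
    """Return a list of (attributes, source) pairs, one per <script> element."""
    collector = ScriptCollector()
    collector.feed(html)
    collector.close()
    return collector.scripts
```

Calling extract_scripts on the text of a page returns one pair per script element, even when other tags on the page are left unclosed. A SAX handler in Pharo would follow the same shape: react to the start, text and end events of the one tag you care about and ignore the rest.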
On 8/14/15, Vincent Blondeau <[hidden email]> wrote:
> Hi,
>
> Look at the class side: there is the method parse: namespace: validation: .
> Call this method instead of parse:, passing false for the last two arguments. It
> should work.
>
> Anyway, you should use the SAX parser. It is faster and consumes less
> memory. It is very simple to get only one tag.
>
> Cheers
> Vincent
>
> On 14 August 2015 at 01:31, Alexandre Bergel <[hidden email]> wrote:
>>
>> Hi!
>>
>> Together with Nicolas we are trying to get all the <script …> … </script>
>> from html files.
>> We have tried to use XMLDOMParser, but many web pages are actually not well
>> formed, so the parser complains.
>>
>> Has anyone tried to extract particular tags from HTML files? This looks
>> like a classic thing to do. Maybe some of you have done it.
>> Is there a way to configure the parser to accept broken XML/HTML
>> content?
>>
>> Cheers,
>> Alexandre
>> --
>> _,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:
>> Alexandre Bergel  http://www.bergel.eu
>> ^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;._,.;:~^~:;.
>>
>>
>> _______________________________________________
>> Moose-dev mailing list
>> [hidden email]
>> https://www.iam.unibe.ch/mailman/listinfo/moose-dev