Hi,

Can you recommend a parcel that can parse HTML? I tried an XML parser, but even the Google page is not valid XHTML: the meta tags are not closed. Maybe the XML parser is a little too strict…

I saw the twoFlower parser, but it seems to be old and unmaintained.

Any ideas?

Thanks
_______________________________________________
Seaside mailing list
[hidden email]
http://lists.squeakfoundation.org/cgi-bin/mailman/listinfo/seaside
2007/4/12, Sylvain Pralon <[hidden email]>:
> Can you advise me a parcel which can parse some html.
> I tried with an xmlparser but even the google page is not xhtml valid, the
> meta tags are not closed.
> Maybe the xml parser is a little too strict…

No, it just happens to parse XML and not HTML, as the name says.

> I saw the twoFlower parser but it seems to be old and not maintained.
> Ideas ?

For Squeak there is:
http://www.squeaksource.com/htmlcssparser.html

Philippe
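To illustrate the distinction drawn in the reply above, here is a minimal sketch in Python (not Smalltalk, and not any of the parsers named in this thread). It feeds the same markup, with an unclosed meta tag, to a strict XML parser and to a lenient HTML parser from the standard library. The sample page and the helper names `xml_parses` and `html_start_tags` are made up for the illustration.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical sample page: the <meta> and <p> tags are never closed,
# which is legal HTML but not well-formed XML.
PAGE = (
    '<html><head><meta name="description" content="demo">'
    '<title>Demo</title></head><body><p>Hello</body></html>'
)


class TagCollector(HTMLParser):
    """Records every start tag the lenient HTML parser encounters."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)


def xml_parses(text):
    """Return True if the text is well-formed XML, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


def html_start_tags(text):
    """Return the start tags found by the HTML parser, in document order."""
    collector = TagCollector()
    collector.feed(text)
    collector.close()
    return collector.tags
```

With this sample, `xml_parses(PAGE)` is false because the XML parser stops at the mismatched `</head>`, while `html_start_tags(PAGE)` still walks the whole document. The XML parser is not being overly strict; it is simply parsing a different language.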
I am on VisualWorks, so I'll look for an equivalent.

-----Original Message-----
From: [hidden email] [mailto:[hidden email]] On behalf of Philippe Marschall
Sent: Thursday, 12 April 2007 13:32
To: Seaside - general discussion
Subject: Re: [Seaside] HTML parser

2007/4/12, Sylvain Pralon <[hidden email]>:
> Can you advise me a parcel which can parse some html.
> I tried with an xmlparser but even the google page is not xhtml valid,
> the meta tags are not closed.

And it doesn't claim to be.

> Maybe the xml parser is a little too strict…

No, it just happens to parse XML and not HTML as the name says.

> I saw the twoFlower parser but it seems to be old and not maintained.
> Ideas ?

For Squeak there is:
http://www.squeaksource.com/htmlcssparser.html

Philippe
In reply to this post by Philippe Marschall
2007/4/12, Sylvain Pralon <[hidden email]>:
> I am on VisualWorks, so I'll look for an equivalent.

If everything else fails, you can pass the website through tidy and get well-formed (and valid) XHTML.

Philippe
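The tidy suggestion could be scripted roughly as follows. This is only a sketch in Python, under the assumption that the HTML Tidy command-line program is installed and reachable as `tidy` on the PATH; the function name `html_to_xhtml` and the chosen option set are illustrative, not something prescribed in this thread. The same idea applies from VisualWorks by calling the external program and handing its output to the XML parser.

```python
import shutil
import subprocess

# Options passed to HTML Tidy (assumed to be installed as "tidy"):
#   -asxhtml  convert the input to XHTML
#   -numeric  emit numeric entities, which a plain XML parser can read
#   -utf8     treat input and output as UTF-8
#   -quiet    suppress the summary banner
TIDY_COMMAND = ["tidy", "-asxhtml", "-numeric", "-utf8", "-quiet"]


def html_to_xhtml(html):
    """Run HTML through the external tidy program and return XHTML text."""
    if shutil.which("tidy") is None:
        raise RuntimeError("tidy executable not found on PATH")
    result = subprocess.run(
        TIDY_COMMAND,
        input=html,
        capture_output=True,
        text=True,
        encoding="utf-8",
    )
    # tidy exits with 0 when clean, 1 when it only issued warnings,
    # and 2 when it hit errors it could not repair.
    if result.returncode > 1:
        raise ValueError("tidy could not repair the document:\n" + result.stderr)
    return result.stdout
```

The returned text can then be given to an ordinary XML parser. Warnings (exit status 1) are expected for real-world pages and are deliberately not treated as failures here.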
In reply to this post by Philippe Marschall
You can probably port it without too much trouble; it mostly relies on streams, IIRC. It is pretty forgiving of rotten input.

-Todd Blanchard