Re: Beginners Digest, Vol 54, Issue 3


flebber


----------------------------------------------------------------------

Message: 1
Date: Sat, 9 Oct 2010 23:07:56 +1100
From: Sayth Renshaw <[hidden email]>
Subject: [Newbies] Html Parser
To: [hidden email]
Message-ID:
       <[hidden email]>
Content-Type: text/plain; charset=ISO-8859-1

I was wondering if there is an HTML parser for Squeak. I want to
capture data from a website, convert it to XML, and export it
into an Excel program I have.

Is this possible in Squeak?


------------------------------

Message: 2
Date: Sat, 9 Oct 2010 09:24:46 -0700
From: Paul C Johnson <[hidden email]>
Subject: Re: [Newbies] Html Parser
To: "A friendly place to get answers to even the most basic questions
       about   Squeak." <[hidden email]>
Message-ID:
       <[hidden email]>
Content-Type: text/plain; charset="utf-8"

I have no idea.

On Sat, Oct 9, 2010 at 5:07 AM, Sayth Renshaw <[hidden email]> wrote:

> I was wondering if there is an HTML parser for Squeak. I want to
> capture data from a website, convert it to XML, and export it
> into an Excel program I have.
>
> Is this possible in Squeak?
> _______________________________________________
> Beginners mailing list
> [hidden email]
> http://lists.squeakfoundation.org/mailman/listinfo/beginners
>



--
Later
Paul

------------------------------

Message: 3
Date: Sat, 9 Oct 2010 22:12:50 +0530
From: "K. K. Subramaniam" <kksubbu.ml@gmail.com>
Subject: Re: [Newbies] Html Parser
To: [hidden email]
Cc: Sayth Renshaw <[hidden email]>
Message-ID: <201010092212.51050.kksubbu.ml@gmail.com>
Content-Type: Text/Plain;  charset="iso-8859-1"

On Saturday 09 Oct 2010 5:37:56 pm Sayth Renshaw wrote:
> I was wondering if there is an HTML parser for Squeak. I want to
> capture data from a website, convert it to XML, and export it
> into an Excel program I have.
>
> Is this possible in Squeak?
Yes. Browse the HtmlParser class.

A good way to dig out such information is the Message Finder (world menu ->
windows -> find message names). Or select the string "html" or "parser" and
press CTRL+SHIFT+W.

HTH .. Subbu


------------------------------

Message: 4
Date: Sat, 9 Oct 2010 21:27:44 +0200 (CEST)
From: Levente Uzonyi <[hidden email]>
Subject: Re: [Newbies] Html Parser
To: "A friendly place to get answers to even the most basic questions
       about   Squeak." <[hidden email]>
Message-ID: <[hidden email]>
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed

On Sat, 9 Oct 2010, Sayth Renshaw wrote:

> I was wondering if there is an HTML parser for Squeak. I want to
> capture data from a website, convert it to XML, and export it
> into an Excel program I have.
>
> Is this possible in Squeak?

Yes it is. We are using Soup (http://www.squeaksource.com/Soup.html) to
parse HTML files. It's pretty good, though not perfect. There are also 2-3
other HTML parsers for Squeak. We're using this one because it's designed
to parse HTML files that are not standards-compliant (which are very
common). The tools for building XML are in the Squeak image; look for
XMLNode and its subclasses (XMLDocument, XMLNodeWithElements, XMLString,
etc.).


Levente



------------------------------

Message: 5
Date: Sun, 10 Oct 2010 11:21:46 +1100
From: Sayth Renshaw <[hidden email]>
Subject: [Newbies] Re: Html Parser
To: [hidden email]
Message-ID:
       <AANLkTi=[hidden email]>
Content-Type: text/plain; charset="iso-8859-1"

I was trying to follow the guide here
http://softwareengineering.vazexqi.com/2007/05/26/installing-packages-in-squeak
to install the Beautiful Soup package. However, I cannot locate the SqueakMap
Package Loader.

I know the menu and interface have been updated since this tutorial (and it
looks good), but I am looking in the "old desktop menu" and can't locate it.

Thanks

Sayth.


------------------------------

Message: 6
Date: Sun, 10 Oct 2010 04:58:08 +0200 (CEST)
From: Levente Uzonyi <[hidden email]>
Subject: Re: [Newbies] Re: Html Parser
To: "A friendly place to get answers to even the most basic questions
       about   Squeak." <[hidden email]>
Message-ID: <[hidden email]>
Content-Type: TEXT/PLAIN; charset=US-ASCII; format=flowed

On Sun, 10 Oct 2010, Sayth Renshaw wrote:

> I was trying to follow the guide here
> http://softwareengineering.vazexqi.com/2007/05/26/installing-packages-in-squeak
> to install the Beautiful Soup package. However, I cannot locate the SqueakMap
> Package Loader.
>
> I know the menu and interface have been updated since this tutorial (and it
> looks good), but I am looking in the "old desktop menu" and can't locate it.

This package (and all packages on squeaksource.com) can be installed with
the Monticello Browser or with Installer. If you're using Squeak 4.1 or
4.2 alpha, you can open the Monticello Browser from the Tools menu of the
Docking Bar (at the top of the screen). If you're using an earlier
version of Squeak, open the desktop menu, select Open..., then select
Monticello Browser (btw, the SqueakMap Package Loader can also be found in
this menu).

Once the Monticello Browser is open, add the Soup repository and
load the latest version. If you have never used the Monticello Browser,
you'll find this link useful (the images are a bit outdated, but this part
seems to be ok): http://wiki.squeak.org/squeak/43#Opening%20a%20Repository
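
When you add an HTTP repository, the Monticello Browser asks for a
repository description. A minimal sketch of what that description might
look like for Soup follows. The location is an assumption: it is the
project page URL mentioned earlier in this thread with the .html suffix
dropped, which is the usual squeaksource.com layout. The empty user and
password assume read-only access is enough for loading.

```smalltalk
"Repository description to enter when adding an HTTP repository.
 Assumed location: the Soup project URL without the .html suffix.
 Empty user and password: assumed sufficient for read-only loading."
MCHttpRepository
    location: 'http://www.squeaksource.com/Soup'
    user: ''
    password: ''
```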

If you want to use Installer to install this package, then evaluate the
following in a workspace:

Installer squeaksource
       project: 'Soup';
       install: 'Soup'


Levente



------------------------------


End of Beginners Digest, Vol 54, Issue 3
****************************************

Thank you. I have been able to install Beautiful Soup thanks to your help.

I also found this great resource, which others may enjoy and benefit from; it features a lot of video tutorials on beginner to advanced topics in Squeak Smalltalk.


Thanks

Sayth 


