Swiki API or Scrapers

tty

Good Morning.

Is anybody aware of a Swiki API or Scraper for grabbing and organizing the Swiki content?

Motive:

I am in the very early alpha stages of SeasideDoc: http://menmachinesmaterials.com/SeasideDoc

Currently, I am coding so that the native Squeak Help classes render as HTML5 via Seaside.

On the radar is grabbing the content from http://wiki.squeak.org/, organizing it "somehow", and displaying it as well.


Thanks in advance.




P.S. Goals for SeasideDoc include:
1. Native Squeak Help -> HTML5 (current work)
2. Squeak Wiki -> SeasideDoc, organized by "tags, topics, something"
3. A "Stack Exchange" style comment and entry system
4. Persistence
5. Exporting from HTML5 to Squeak Help, Texinfo, and "foo" formats




_______________________________________________
seaside mailing list
[hidden email]
http://lists.squeakfoundation.org/cgi-bin/mailman/listinfo/seaside

Re: Swiki API or Scrapers

Pierce Ng-3
On Fri, Oct 26, 2018 at 03:15:17AM -0400, gettimothy wrote:
> Good Morning. Is anybody aware of a Swiki API or Scraper for grabbing
> and organizing the Swiki content? Motive:  I am in the very early
> alpha stages  for
> SeasideDoc http://menmachinesmaterials.com/SeasideDoc   Currently, I
> am coding so that the native Squeak Help classes render as html5 via
> Seaside. On the radar is grabbing the content
> from http://wiki.squeak.org/ , organize it "somehow" and displaying it

I recall there is an export of the Squeak wiki content, each page
being saved as an XML file.

Ah, found the conversation:

  http://forum.world.st/editing-the-Squeak-wiki-XML-file-library-td4786516.html

One possibility might be to ask the current maintainers of the wiki to
do another export. Or, set up a mirror of that wiki, and you'll have all
the content in your mirror instance.

Pierce

Re: Swiki API or Scrapers

tty
Pierce,

Thank you.

The Dropbox link is dead, but I will post this thread to the squeak-dev list.

If it's XML, I *should* be able to figure something out with the Squeak XML tools.

Thanks again.

tty

---- On Fri, 26 Oct 2018 21:34:38 -0400 Pierce Ng <[hidden email]> wrote ----

On Fri, Oct 26, 2018 at 03:15:17AM -0400, gettimothy wrote:
> Good Morning. Is anybody aware of a Swiki API or Scraper for grabbing
> and organizing the Swiki content? Motive:  I am in the very early
> alpha stages  for
> SeasideDoc http://menmachinesmaterials.com/SeasideDoc   Currently, I
> am coding so that the native Squeak Help classes render as html5 via
> Seaside. On the radar is grabbing the content
> from http://wiki.squeak.org/ , organize it "somehow" and displaying it

I recall there is an export of the Squeak wiki content, each page
being saved as an XML file.

Ah found the convo:


One possibility might be to ask the current maintainers of the wiki to
do another export. Or, set up a mirror of that wiki, and you'll have all
the content in your mirror instance.

Pierce

Swiki API Scraper data rendering on SeasideDoc

tty
In reply to this post by Pierce Ng-3
Hi Hannes,

The first 10 pages of the Swiki show up under the left menu item Swiki.

Ugly, but there. (:


See
SeasideDocSwikiProxy >> scrapeWikiPagesToNewClassFrom: aInteger To: zInteger pausingBetweenBy: delayInteger
for how it is done.
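
For readers without the image at hand, the selector suggests a loop over a range of page numbers with a pause between requests. Below is a minimal Python sketch of that idea. It is not the SeasideDoc code: the numbered URL pattern under http://wiki.squeak.org/squeak/ and every function name here are assumptions made for illustration.

```python
import time
from typing import Callable, Dict
from urllib.request import urlopen

# Assumption: Swiki pages are addressable by number under this base URL.
BASE_URL = "http://wiki.squeak.org/squeak/"


def fetch_page(page_number: int) -> str:
    """Download one page and return its raw HTML as text (hypothetical helper)."""
    with urlopen(f"{BASE_URL}{page_number}", timeout=30) as response:
        return response.read().decode("utf-8", errors="replace")


def scrape_pages(
    first: int,
    last: int,
    delay_seconds: float,
    fetch: Callable[[int], str] = fetch_page,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[int, str]:
    """Fetch pages first..last inclusive, pausing between requests."""
    pages: Dict[int, str] = {}
    for number in range(first, last + 1):
        pages[number] = fetch(number)
        if number < last:
            # Pause so the wiki server is not hit with back-to-back requests.
            sleep(delay_seconds)
    return pages
```

Calling `scrape_pages(1, 10, 2.0)` would request pages 1 through 10 with a two-second pause between them. The `fetch` and `sleep` parameters are there only so the loop can be exercised without touching the network.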


On the DocletSeasideDoc (home doclet) I updated the Roadmap.
Go to Swiki Page 3 to see an example of a page that should be culled.

I have also added menu items for Blogs, Video, and Books.

A web interface for adding and editing these items will follow.

Regarding the Swiki pages...

The classes are created, and methods storing the page HTML are created and compiled.

I guess it's good to keep that raw HTML as the baseline and present cleaned-up versions of it instead.

I will play around with that.
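
One way to picture the "raw baseline, cleaned presentation" split is sketched below in Python, using only the standard library. It is an illustration of the idea, not what SeasideDoc does: the class and function names are hypothetical, and "cleaned" here just means visible text with script and style content dropped.

```python
from html.parser import HTMLParser
from typing import List


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> contents."""

    SKIPPED = {"script", "style"}

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self._chunks: List[str] = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside skipped elements.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self) -> str:
        return "\n".join(self._chunks)


def cleaned_view(raw_html: str) -> str:
    """Derive a cleaned view; the raw HTML string itself is left untouched."""
    parser = TextExtractor()
    parser.feed(raw_html)
    parser.close()
    return parser.text()
```

Because the cleaned text is derived on demand, the stored raw HTML stays the single source of truth and the cleanup rules can change later without re-scraping.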


In summary, the large "get existing Help data and display it" subsystems are in place in alpha form.

This thing will get into iterative dev mode pretty soon.

Let me know when you think it is worth storing in an official Squeak repo, and which repo we should use.

cheers,

t




