Saving pictures on a web site from within Dolphin

Saving pictures on a web site from within Dolphin

Mikael Svane
I am writing a small program that will automatically download pictures from a
web site (a book that has been scanned and is stored as 358 different
pictures at 'http://runeberg.org/geodet/0001.html' and so on). Basically,
the behaviour that I want is the same as one gets by right-clicking on the
picture in Internet Explorer and choosing "save link as...", but automatic. I
have a working solution which uses URLMonLibrary>>urlDownload:toFile: to
download the source of the web page, which is then analysed for links to
pictures; the pictures are then downloaded and saved as files, also using
#urlDownload:toFile:. Since this solution is a bit complex, especially the
part that identifies the links to the pictures, I was wondering if there might
be a better solution to the problem. I have investigated the Internet
Explorer package with its IWebBrowser class, but have found nothing useful
so far.
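
For reference, the download step itself is roughly this (a minimal sketch;
the local path is illustrative, and I am assuming the library instance is
obtained via #default, as with Dolphin's other ExternalLibrary subclasses):
==========
"Fetch the page source to a local file so it can be analysed for links."
url := 'http://runeberg.org/geodet/0001.html'.
URLMonLibrary default urlDownload: url toFile: 'C:\temp\0001.html'.
==========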

Best regards,

Mikael Svane


Re: Saving pictures on a web site from within Dolphin

Christopher J. Demers
"Mikael Svane" <[hidden email]> wrote in message
news:[hidden email]...

> I am writing a small program that will automatically download pictures
> from a web site (a book that has been scanned and is stored as 358
> different pictures at 'http://runeberg.org/geodet/0001.html' and so on).
[snip]

I am not sure what assumptions one can make, or how general a solution is
desired.  You could continue to do it in a way similar to what you do now,
but use a Stream on the page source for parsing, since there is no need to
save it to a file.  You certainly could drive IE via automation, but unless
you need to do lots of parsing or web interaction it may not be worth using.
==========
"Open a text stream directly on the page source; no temporary file needed."
htmlStream := FileStream
    on: (IStream onURL: 'http://runeberg.org/geodet/0300.html')
    text: true.

"Find the image tag by its alt text, then move back to the start of the match."
refText := 'alt="scanned image"'.
htmlStream skipToAll: refText.
htmlStream position: htmlStream position - refText size.

"Scan backwards to the closing quote of the src attribute..."
[htmlStream pop; peek = $"] whileFalse.
endPos := htmlStream position.

"...and then to its opening quote, stepping forward over it."
[htmlStream pop; peek = $"] whileFalse.
htmlStream skip: 1.
startPos := htmlStream position.

"Everything between the quotes is the relative image URL."
relativeImageURL := htmlStream next: endPos - startPos.
==========

I took a look at the site, and I see that just as the HTML pages have
predictable file names, so do the images, e.g.:
http://runeberg.org/img/geodet/0001.5.png
...
http://runeberg.org/img/geodet/0300.5.png

Since you know the format of the URL and the number of pages, why not do
something like this:
============================
firstPageNum := 1.
lastPageNum := 358.
"Build the list of image URLs; %04d zero-pads the page number to four digits."
urlCol := (firstPageNum to: lastPageNum) collect: [:pageNum |
    'http://runeberg.org/img/geodet/%04d.5.png' sprintfWith: pageNum].
============================
Then just use your existing download code to save all the images in urlCol.
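
Putting the two pieces together, the whole job could look something like
this (just a sketch; the local directory is illustrative, and again I am
assuming the library instance is reached via #default):
==========
"Download every page image straight to disk, numbering the local files
to match the remote ones. C:\scans is only an example destination."
(1 to: 358) do: [:pageNum |
    url := 'http://runeberg.org/img/geodet/%04d.5.png' sprintfWith: pageNum.
    localFile := 'C:\scans\%04d.png' sprintfWith: pageNum.
    URLMonLibrary default urlDownload: url toFile: localFile].
==========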

Chris