Downloading text vs. binary

Sean P. DeNigris
Administrator
Zinc seems to think the following is text, but if I download it via Chrome on my Mac, it recognizes it as binary:

        | latestVersionUrl |
        latestVersionUrl := 'http://mirrors.jenkins-ci.org/war/latest/jenkins.war'.
        ZnClient new
                accept: ZnMimeType applicationOctetStream;
                url: latestVersionUrl;
                downloadTo: directory.
Cheers,
Sean

Re: Downloading text vs. binary

Sven Van Caekenberghe-2
Sean,

On 18 Apr 2013, at 22:41, "Sean P. DeNigris" <[hidden email]> wrote:

> Zinc seems to think the following is text, but if I download it via Chrome on
> my Mac, it recognizes it as binary:
>
> | latestVersionUrl |
> latestVersionUrl := 'http://mirrors.jenkins-ci.org/war/latest/jenkins.war'.
> ZnClient new
>         accept: ZnMimeType applicationOctetStream;
>         url: latestVersionUrl;
>         downloadTo: directory.

Where/how does Zinc think this is text ?

(ZnEasy head: 'http://mirrors.jenkins-ci.org/war/latest/jenkins.war') contentType isBinary  ==> true

I didn't try your statement though.

Sven

PS: you should have warned me that it was so big, 53 MB, I had to proceed the ZnEntityTooLarge exception ;-)


--
Sven Van Caekenberghe
Proudly supporting Pharo
http://pharo.org
http://association.pharo.org
http://consortium.pharo.org

Re: Downloading text vs. binary

Sean P. DeNigris
Administrator
Sven Van Caekenberghe-2 wrote
> Where/how does Zinc think this is text ?

If I add #systemPolicy, and alter the snippet to:

        latestVersionUrl := 'http://mirrors.jenkins-ci.org/war/latest/jenkins.war'.
        javaArchiveMimeType := ZnMimeType main: 'application' sub: 'java-archive'.
        ZnClient new
                systemPolicy;
                accept: javaArchiveMimeType;
                followRedirects: true;
                url: latestVersionUrl;
                downloadTo: FileLocator imageDirectory.

I get either:
a. /intermittently/ "ZnUnexpectedContentType: expected application/java-archive actual text/plain"
or
b. it downloads, but it shows up in Mac Finder as a text file (kind: TEXT)... not the end of the world, but...
Cheers,
Sean

Re: Downloading text vs. binary

Igor Stasenko
On 18 April 2013 23:10, Sean P. DeNigris <[hidden email]> wrote:

> Sven Van Caekenberghe-2 wrote
>> Where/how does Zinc think this is text ?
>
> If I add #systemPolicy, and alter the snippet to:
> latestVersionUrl := 'http://mirrors.jenkins-ci.org/war/latest/jenkins.war'.
> javaArchiveMimeType := ZnMimeType main: 'application' sub: 'java-archive'.
>         ZnClient new
>                 systemPolicy;
>                 accept: javaArchiveMimeType;
>                 followRedirects: true;
>                 url: latestVersionUrl;
>                 downloadTo: FileLocator imageDirectory.
>
> I get either:
> a. /intermittently/ "ZnUnexpectedContentType: expected
> application/java-archive actual text/plain"

Here the server disregards your request that the content be
application/java-archive.
Who is to blame? Java, of course! :)

> or
> b. it downloads, but it shows up in Mac Finder as a text file (kind:
> TEXT)... not the end of the world, but...
>
>
>
> -----
> Cheers,
> Sean
> --
> View this message in context: http://forum.world.st/Downloading-text-vs-binary-tp4682396p4682406.html
> Sent from the Pharo Smalltalk Users mailing list archive at Nabble.com.
>



--
Best regards,
Igor Stasenko.


Re: Downloading text vs. binary

NorbertHartl
In reply to this post by Sean P. DeNigris

On 18.04.2013, at 23:10, "Sean P. DeNigris" <[hidden email]> wrote:

> Sven Van Caekenberghe-2 wrote
>> Where/how does Zinc think this is text ?
>
> If I add #systemPolicy, and alter the snippet to:
> latestVersionUrl := 'http://mirrors.jenkins-ci.org/war/latest/jenkins.war'.
> javaArchiveMimeType := ZnMimeType main: 'application' sub: 'java-archive'.
> ZnClient new
>         systemPolicy;
>         accept: javaArchiveMimeType;
>         followRedirects: true;
>         url: latestVersionUrl;
>         downloadTo: FileLocator imageDirectory.
>
> I get either:
> a. /intermittently/ "ZnUnexpectedContentType: expected
> application/java-archive actual text/plain"
> or
> b. it downloads, but it shows up in Mac Finder as a text file (kind:
> TEXT)... not the end of the world, but…

Sean,

Let me guess: you can't test it reliably. Sometimes it works and sometimes it doesn't? This is the situation:

- your URL is a redirect to another URL
- the redirect target is load balanced, so you get a different server on subsequent requests
- the guys at dl.aragost.com didn't configure the web server correctly. They didn't add a content type handler for .war, so it says text/plain

Just do

$ curl -v http://dl.aragost.com//jenkins/war/1.511/jenkins.war -o /tmp/foo

and you can see the text/plain content-type, while

$ curl -v http://jenkins.mirror.isppower.de/war/1.511/jenkins.war -o /tmp/foo

gives you the java type.
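
The comparison above can also be scripted. What follows is only a minimal sketch: it assumes header dumps in the form that `curl -sIL` prints (one header block per hop of the redirect chain), the function name final_content_type is made up for illustration, and the sample headers are abridged from the Zinc log later in this thread with an assumed HTTP/1.1 status line.

```shell
#!/bin/sh
# Print the Content-Type of the last response in a header dump read from stdin.
# With `curl -sIL URL` every hop of a redirect chain prints its own header block,
# so the last Content-Type line belongs to the server that finally serves the file.
# (final_content_type is a hypothetical helper, not part of curl or Zinc.)
final_content_type() {
    tr -d '\r' |
    awk 'tolower($0) ~ /^content-type:/ {
             sub(/^[^:]*:[ \t]*/, "")   # strip the header name and leading blanks
             last = $0
         }
         END { if (last != "") print last }'
}

# Sample input: abridged headers of a 302 redirect followed by the final 200.
# The "HTTP/1.1" status lines are an assumption about what curl would print.
sample_headers='HTTP/1.1 302 Found
Content-Type: text/html; charset=iso-8859-1
Location: http://ftp.nluug.nl/programming/jenkins/war/1.511/jenkins.war

HTTP/1.1 200 OK
Content-Length: 53866436
Content-Type: text/plain; charset=UTF-8
'

printf '%s' "$sample_headers" | final_content_type
```

Against a live server this would be used as `curl -sIL <url> | final_content_type`. Repeating it a few times against the redirecting URL should show the reported type changing with whichever mirror gets selected.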

Norbert

Re: Downloading text vs. binary

Sean P. DeNigris
Administrator
Norbert Hartl wrote
> - the guys at dl.aragost.com didn't configure the web server correctly. They didn't add a content type handler for .war so it says text/plain

Oy! Nice investigation :) The weird thing is that curl seemed to work reliably (albeit over only a handful of test runs). I'll use curl for the time being :/
Cheers,
Sean

Re: Downloading text vs. binary

Sven Van Caekenberghe-2

On 19 Apr 2013, at 05:38, "Sean P. DeNigris" <[hidden email]> wrote:

> Norbert Hartl wrote
>> - the guys at dl.aragost.com didn't configure the web server correctly. They
>> didn't add a content type handler for .war so it says text/plain
>
> Oy! Nice investigation :) The weird thing is that curl seemed to work
> reliably (albeit only a handful of test runs). I'll use curl for the time
> being :/

Nooooo ;-)

Norbert actually demonstrated with curl that different servers returned different results, so it is/was not the client (either Zinc or curl). If you leave out the #accept:, ZnClient won't complain about an unexpected content type, just like curl. The following worked for me (mirror selection is probably location based):

ZnClient new
        systemPolicy;
        logToTranscript;
        url: 'http://mirrors.jenkins-ci.org/war/latest/jenkins.war';
        downloadTo: FileLocator imageDirectory;
        response.

2013-04-19 08:55:39 760189 I Wrote a ZnRequest(GET /war/latest/jenkins.war)
2013-04-19 08:55:39 760189 D Sent headers
Accept: */*
User-Agent: Zinc HTTP Components 1.0
Host: mirrors.jenkins-ci.org

2013-04-19 08:55:39 760189 I Read a ZnResponse(302 Found text/html;charset=iso-8859-1 333B)
2013-04-19 08:55:39 760189 D Received headers
Vary: Accept-Encoding
Content-Length: 333
Connection: close
Set-Cookie: SERVERID=Local; path=/
Content-Type: text/html; charset=iso-8859-1
Date: Fri, 19 Apr 2013 06:55:38 GMT
Location: http://ftp.nluug.nl/programming/jenkins/war/1.511/jenkins.war
X-Mirrorbrain-Realm: region
X-Mirrorbrain-Mirror: ftp.nluug.nl
Server: Apache/2.2.14 (Ubuntu)

2013-04-19 08:55:39 760189 D Redirecting
2013-04-19 08:55:39 760189 D Received cookie: SERVERID=Local; path=/; domain=mirrors.jenkins-ci.org
2013-04-19 08:55:39 760189 I Wrote a ZnRequest(GET /programming/jenkins/war/1.511/jenkins.war)
2013-04-19 08:55:39 760189 D Sent headers
Accept: */*
User-Agent: Zinc HTTP Components 1.0
Host: ftp.nluug.nl

2013-04-19 08:55:39 760189 I Read a ZnResponse(200 OK text/plain;charset=UTF-8 53866436B)
2013-04-19 08:55:39 760189 D Received headers
Date: Fri, 19 Apr 2013 06:55:38 GMT
Content-Length: 53866436
Etag: "8c6442f-335efc4-4da5c37a13200"
Server: Apache/2.2.15 (CentOS)
Accept-Ranges: bytes
Last-Modified: Mon, 15 Apr 2013 01:31:52 GMT
Content-Type: text/plain; charset=UTF-8

2013-04-19 08:55:39 760189 T GET /programming/jenkins/war/1.511/jenkins.war 200 53866436B 349ms


And I tested the .war using jar -tvf so I guess it is OK.

Zinc might fail when interpreting binary data as text, since text/plain defaults to UTF-8, but it didn't in the example above, even though the server returned text/plain; charset=UTF-8
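
To illustrate why such decoding could fail: a .war is a ZIP archive, and compressed bytes are generally not well-formed UTF-8. The following is only a rough sketch, assuming an iconv that exits non-zero on invalid input (glibc's does). The helper is_valid_utf8 is hypothetical and the trailing sample bytes are made up; none of this shows how Zinc itself decodes entities.

```shell
#!/bin/sh
# Succeed only if stdin is well-formed UTF-8.
# Assumption: iconv exits non-zero when it meets an invalid byte sequence.
# (is_valid_utf8 is a hypothetical helper for this illustration.)
is_valid_utf8() {
    iconv -f UTF-8 -t UTF-8 >/dev/null 2>&1
}

# Plain ASCII text is also valid UTF-8.
if printf 'Content served as text\n' | is_valid_utf8; then
    echo "text sample: valid UTF-8"
else
    echo "text sample: not valid UTF-8"
fi

# A ZIP archive starts with the signature "PK\003\004" and continues with
# compressed data. The trailing \377\376 are made-up stand-ins for such data;
# the byte 0xFF never occurs in well-formed UTF-8.
if printf 'PK\003\004\377\376' | is_valid_utf8; then
    echo "binary sample: valid UTF-8"
else
    echo "binary sample: not valid UTF-8"
fi
```

A client that trusts a text/plain; charset=UTF-8 header and decodes the body strictly would hit the second case, whereas one that writes the bytes to disk unchanged would not notice.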

Sven


Re: Downloading text vs. binary

Sean P. DeNigris
Administrator
On Apr 19, 2013, at 3:06 AM, "Sven Van Caekenberghe-2 [via Smalltalk]" <[hidden email]> wrote:
> Nooooo ;-)
Okay, okay, just stop yelling!! ;)

I saw that it was the server, but since it seemed to work consistently with curl I figured they were doing some sort of magic to figure out that it should be binary. Can we have an option to force a certain MIME type for these kinds of cases?
Cheers,
Sean

Re: Downloading text vs. binary

Sven Van Caekenberghe-2

On 19 Apr 2013, at 13:08, "Sean P. DeNigris" <[hidden email]> wrote:

> On Apr 19, 2013, at 3:06 AM, "Sven Van Caekenberghe-2 [via Smalltalk]" <[hidden email]> wrote:
> > Nooooo ;-)
> Okay, okay, just stop yelling!! ;)

Good morning to you !

> I saw that it was the server, but since it seemed to work consistently with curl I figured they were doing some sort of magic to figure out that it should be binary. Can we have an option to force a certain MIME type for these kinds of cases?

Yes, I was considering that, but in my test it was not necessary.

There is already a similar option on the Zinc server side (always reading incoming entities as binary, used by the Seaside adaptor that wants to do its own conversions).

But it would make everything more complex and uglier, all for fixing a problem of a wrong server…

On the other hand, Zinc should be able to do whatever curl does, so I'll think about it.


Re: Downloading text vs. binary

Sean P. DeNigris
Administrator
> But it would make everything more complex and uglier, all for fixing a problem of a wrong server…

I feel your pain... it hurts when our clean beautiful vision encounters the ugly reality of the outside world ;)
Cheers,
Sean