Error when loading from github (failed only on gemstone)

Previous Topic Next Topic
 
classic Classic list List threaded Threaded
22 messages Options
12
Reply | Threaded
Open this post in threaded view
|

Re: Error when loading from github (failed only on gemstone)

Dale Henrichs-3
Now I’m on my phone ... now that we’re seeing consistent failures I agree that the size of the file is most likely the main issue ... I think it will take a reading if the zip file spec to get to the bottom of this one ... it is suspicious that a number of fields are ignored when reading the “central directory” and my first suspicion is that we’re losing important info there (don’t have code at my fingertips) ... since we’re off by only a few bytes the file must have just reached an important threshold of some sort .... I need to get some rest now ... i should be able to take up the thread tomorrow morning ... thanks for double checking ...

Dale

Sent from my iPhone

On Dec 18, 2017, at 6:19 PM, Mariano Martinez Peck <[hidden email]> wrote:

Hi Dale...

Back with my computer now. WTF... I tried both snippets in my original images as well as in imaged from scratch (both in 6.1) and it fails too!!!
The only explanation I can find is that the first time I build the image the zipball might have worked (maybe it was smaller) and after that, the rest of the loads happened via Iceberg. 
Another possible related thing is that in  all my images I share the same package cache and probably github-cache etc and that may also decrease the chances of having to re-download the zipball. 

But as said, I just tried re-loading in my original images and also from scratch and I can see the problem. The bad news is that the problem is even worst as now I cannot even load it in Pharo. The good news is that it's consistent with GemStone.

At this point in time(with your investigation), I don't think its worth exploring why it worked for me at certain point in Pharo 6.1. We have a reproducible test case that fails on both platforms and with a "default" image, right?
And we suspect that something related to file size is involved because `master` branch does work. If you want I can also send an email to Pharo mailing list to see if there is someone else willing to give us a hand. 

Curl is able to download:


 ❯ ls -lah highcharts.zip                                                                 [23:18:06]
-rw-r--r-- 1 mariano staff 79M Dec 18 23:14 highcharts.zip

and zip seems valid:

 ❯ unzip -q -t highcharts.zip                                                             [23:18:33]
No errors detected in compressed data of highcharts.zip.


What do you think? Thanks for pushing it!


On Mon, Dec 18, 2017 at 10:31 PM, Dale Henrichs <[hidden email]> wrote:

Also pharo6.1 uses a different pharo vm ....


On 12/18/17 5:29 PM, Dale Henrichs wrote:

Geez ... pharo6.0 and pharo6.1 are very different AFAIK ... iceberg is loaded in 6.1 while I assume that the code path for downloading github zip files is same for pharo6.1, ... there is the funky "enable metacello" stuff needed(?) when iceberg is being used and I do not know what kind of monkey business is involved there .... I just know that at ESUG the enable Metcello stuff broke Metacello is several different ways that did not make sense ...

Before I spend anymore time on this could you confirm that whether or not this expression works in your pharo image ... and share the exact version information about what you using:

    outputFileName := '/tmp/test.zip'.
    ZnClient new
        url: 'https://github.com/ba-st/HighchartsSt/zipball/highchart6-import';
        downloadTo: outputFileName.
    ^ ZipArchive new
        readFrom: outputFileName asFileReference

Then set a halt in this method[1] in your image and run this metacello expression:

  Metacello new
    baseline: 'HighchartsSt';
    repository: 'github://ba-st/HighchartsSt:highchart6-import/repository';
    get

and see if you hit the halt ... if you do hit the halt, confirm that you get the same result as from the first expression with ZnClient/ZipArchive ...

if you don't hit the halt then figure our what code is being used in your image  to download the file ...

Remember, if ZipArchive is being used and it works you image, then there's a chance that the ZipArchive patch (which is not present in Pharo6.0) can be applied to the GemStone code ... if not, then a patch will depend upon what you learn about the code path actually being used for Pharo6.1 ...

It's unfortunate that I didn't know you were using pharo6.1 earlier in the day:)

Dale

[1] https://github.com/Metacello/metacello/blob/master/repository/Metacello-Platform.pharo60.package/MetacelloPharoPlatform.class/instance/downloadZipArchive.to..st
On 12/18/17 5:13 PM, Mariano Martinez Peck wrote:
Uhhh good point. I am using 6.1 but I guess that shouldn't change much right? 

On Dec 18, 2017 10:02 PM, "Dale Henrichs" <[hidden email]> wrote:

but I am trying to understand why you are able to load it cleanly in Pharo6.0 when my tests show that it should not load cleanly in pharo 6.0 either ... the metacello get expression in Pharo6.0 ends up calling the ZnClient/ZipArchive code that fails for me ... so I think that either the ZipArchive bug was fixed in _your_ version of Pharo 6.0 or ... we are comparing apples to oranges and your claim that "it works in pharo" is not necessary correct ... if the bug is fixed in ZipArchive then it can be fixed in GemStone if not ... then I assume that you will find that the github load will fail in pharo as well --- as it does for me ...

Dale


On 12/18/17 4:51 PM, Mariano Martinez Peck wrote:
Again from phone sorry....

Yes, we are using a special branch for highcharts 6 in which have a lot more of auto generated code than the the master. That's the difference in the zip files and kind of confirms my suspicious that zip file size is the key point of the bug. 

On Dec 18, 2017 9:42 PM, "Dale Henrichs" <[hidden email]> wrote:

Mariano,

The url that you used in your earlier email: 'https://github.com/ba-st/HighchartsSt/zipball/highchart6-import' is not the url that I get when I try to do to the following:

  Metacello new
    baseline: 'HighchartsSt';
    repository: 'github://ba-st/HighchartsSt:master/repository';
    get

Which btw worked for me when executing the expression for both GemStone and my version of Pharo6.0 .... on the mac ... and the size of the two zip files differs greatly:

 successful downloaded by ZinClient for pharo ('github://ba-st/HighchartsSt:master/repository'):

   -rw-r--r--@  1 dhenrich  wheel   9429194 Dec 18 16:26 github--bastHighchartsStmaster437317039036239984103512910491.zip

 successful ownloaded by curl ('github://ba-st/HighchartsSt:master/repository'):

  -rw-r--r--   1 dhenrich  wheel   9429194 Dec 18 16:28 github-34040-bastHighchartsStmaster.zip

Contrast that with the size of the file for the hichchart6-import url:

  -rw-r--r--   1 dhenrich  wheel  82456389 Dec 18 12:14 github-3574-bastHighchartsSthighchart6import.zip

Which cannot be handled correctly by wither Pharo6 or GemStone ... and I've verified that the ZnClient/ZipArchive combo is being used by my version of Pharo6.0 and looks like the latest version that should be used by 6.0 from the github repo[1]...

So right now I'm inclined to think that we're comparing apples and oranges sizne the urls are very different and the size of the downloads are very different as well ... oh now I see that you are not using the master branch ... so I've wasted a bit of time going down this path ...

I'll retest everthing yet again using your branch and see what I find out ...

Dale

[1] https://github.com/Metacello/metacello/blob/master/repository/Metacello-Platform.pharo60.package/MetacelloPharoPlatform.class/instance/downloadZipArchive.to..st


On 12/18/17 2:45 PM, Mariano Martinez Peck wrote:
Answering from phone... Yes it did work in GemStone but not the zip archive but the sender download to from Metacello platform . Which as you concluded it
Does something different.. I think it was using zinc streams.. I did not try the isolated zip stream code in Pharo as you just said. 

So I agree it must be a zip stream bug in both dialects. 

Thanks 

On Dec 18, 2017 7:33 PM, "Dale Henrichs" <[hidden email]> wrote:
You said it worked perfectly in pharo6.0 but I am getting the same error as GemSTone running this code in  Pharo60:

    outputFileName := '/tmp/test.zip'.
    ZnClient new
        url: 'https://github.com/ba-st/HighchartsSt/zipball/highchart6-import';
        downloadTo: outputFileName.
    ^ ZipArchive new
        readFrom: outputFileName asFileReference

So the bug _is_ in ZipArchive and seems to exist is Pharo6 and GEmStone ... this is the Pharo6.0 I am running:

  Pharo 6.0
  Latest update: #60319

So perhaps the ZipARchive bug has been recently fixed or ???

Dale

On 12/18/17 2:21 PM, Dale Henrichs wrote:

I just tried reading the zip file using the Pharo6 ZipArchive code and I get the same error as in GemStone ... so the error is in the (old) ZipArchive reader code in Pharo6 as well as GemStone ... Pharo6 is probably using something other than the ZipArchive reader for cracking zip files these days ...

Dale


On 12/18/17 1:51 PM, Dale Henrichs wrote:

Hmmm .... replaced RWBinaryOrTextStream with FileStreamPortable ... which BTW should do better memory wise, since FileStreamPortable does not load the entire file into memory, but instead depends up on the GsFile implementation for buffering from disk ... but ended up with same error at the same spot in the file so I don't think the error is related to String/UTF8 errors ... sure enough when looking at it a bit closer, we have yet to read a single member from the the zip file, so the bookkeeping calculations must be allo wrong ... shouldn't be too hard to find at this point ...

Dale


On 12/18/17 1:24 PM, Dale Henrichs wrote:

Hmmm an off by 6 error .... 6 bytes short of the signatures ... 6 bytes in 70M is not a normal off-by error ... We're storing the itsCollection as a String and the zipped contents should actually be UTF8 so I am wondering if there is an 8 byte character issue involved here somewhere???


--------------------
.              -> aRWBinaryOrTextStream
..             -> aZipArchive
(class)@       -> RWBinaryOrTextStream
(oop)@         -> 255126529
isBinary@      -> true
itsCollection@ -> 'PK & K ba-st-HighchartsSt-c9b4696/UT Ø[2ZPK & KW û ¿ c * ba-st-HighchartsSt-c9b4696/.smalltalk.stonU. . .'
name@          -> nil
position@      -> 54469880

------------------------
self copyFrom: 54469880 - 10  to: 54469880 + 10

(bytes)@ -> #[ 57 54 47 85 84 5 0 1 216 91 50 90 80 75 1 2 0 0 10 0 0]
signatureData                   ^
CentralDirectoryFileHeaderSignature              ^
54469880                                   ^

signatureData                                 -> #[ 0 1 216 91]
CentralDirectoryFileHeaderSignature           -> #[ 80 75 1 2]
On 12/18/17 12:27 PM, Mariano Martinez Peck wrote:
Thanks Dale! Yeah, I had also the impression that it was NOT due to GemStone 3.4. My gut feeling is that there is a bug in the ZipArchive code but that only arrives with large zips.

Cheers, 

On Mon, Dec 18, 2017 at 5:25 PM, Dale Henrichs <[hidden email]> wrote:

Okay I get the same error when running with 3.3.6 and a large TOC .... so I will see what I can figure out where the bug is ...

Dale


On 12/18/17 12:20 PM, Dale Henrichs wrote:

Sorry for the delay in getting back to you ... I've been down with the flue for the last week and saw your mail just today ... not feeling that great today either, but I'll see what I can figure out ...

My first data point is that the difference betwee CentOs and Mac appears to be simply related to how much TOC you've got allocated on each machine ... after bumping up the TOC on the mac I get exactly the same error ... this is for 3.4.0 right now ... I'll give it a try on 3.3.x and see if I get the same error ... but right now it certainly looks like a bug in the zip reader code...

Dale


On 12/14/17 4:43 AM, Mariano Martinez Peck wrote:
Hi Dale, 

I am trying to load HighchartsSt as a dependency with Metacello. This worked perfectly on Pharo but when I try on GemStone I am getting an error. The way to reproduce it for me is as simple as this:

MetacelloGemStonePlatform new downloadZipArchive: 'https://github.com/ba-st/HighchartsSt/zipball/highchart6-import' to: '/tmp/github-3574-bastHighchartsSthighchart6import.zip'

And this yields me the attached error.

I don't think Github is giving me broken zips. Furthermore, if I do a `wget` then `unzip` it does work correct.  I am trying to check if this fails in older GemStone than 3.4 but I am failing to start my old stones too grrrrr...

Thanks in advance,

--
--
You received this message because you are subscribed to the Google Groups "Metacello" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.


--
You received this message because you are subscribed to the Google Groups "Metacello" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.



--
--
You received this message because you are subscribed to the Google Groups "Metacello" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.




--
You received this message because you are subscribed to the Google Groups "Metacello" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.
--
You received this message because you are subscribed to the Google Groups "Metacello" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "Metacello" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.
--
You received this message because you are subscribed to the Google Groups "Metacello" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "Metacello" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.
--
You received this message because you are subscribed to the Google Groups "Metacello" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.


--
You received this message because you are subscribed to the Google Groups "Metacello" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.



--

--
You received this message because you are subscribed to the Google Groups "Metacello" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "Metacello" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.
Reply | Threaded
Open this post in threaded view
|

Re: Error when loading from github (failed only on gemstone)

Dale Henrichs-3
In reply to this post by Mariano Martinez Peck

Mariano,

I'm not quite sure why you are relying on zip downloads from github for your production builds ... you should be able to clone the HighChartsSt project to your local disk and rely on a Metacello lock to ensure that the project is loaded from your local clone ... you should do this for all of the github projects that you are using ...

Using local clones and Metacello lock is definitely superior than attempting to hack around with alternate schemes for dealing with zip files .... the github project is git-based after all ...

Dale


On 12/15/17 1:17 PM, Mariano Martinez Peck wrote:
With the usage of git, its likely that within our project, aside from the CODE repository subdirectory, we also store other none code stuff. And with Metacello/Monticello we must download a zipball of the whole project, and not only the "code subdirectory", which might be too big. In fact, even the code subdirectory much be too big like in this case I am having.  

With that in mind, I don't like having to unzip everything all together in memory. In `downloadZipArchive:to:` you can see: 

  zipfile containingDirectory
    readOnlyFileNamed: zipfile localName
    do: [ :fileStream | stream := RWBinaryOrTextStream on: fileStream contents ].

Where `fileStream contents` brings everything into the image.

So.... my question is ... in the same way that this methods does a `performOnServer:` with `curl`


downloadBasicFile: url to: outputFileName includeHTTPHeader: includeHTTPHeader username: username pass: pass
  "download from <url> into <outputFileName>"

  | errorFileName args |
  "Make a unique name for the error log and ensure that we can write the files"
  errorFileName := self downloadErrorFileNameFor: outputFileName.
  (ServerFileDirectory on: outputFileName) forceNewFileNamed: outputFileName.
  (ServerFileDirectory on: errorFileName) forceNewFileNamed: errorFileName.
  args := username ifNotNil: [ ' -u ' , username , ':' , pass ] ifNil: [ '' ].
  includeHTTPHeader
    ifTrue: [ args := args , ' -i ' ].
  System
    performOnServer:
      '/usr/bin/curl' , args , ' -L ' , url , ' > ' , outputFileName , ' 2> '
        , errorFileName.
  ^ errorFileName


Can't we adapt the previous method to do a `performOnServer:` to do a `unzip` ? 






On Thu, Dec 14, 2017 at 11:13 AM, Mariano Martinez Peck <[hidden email]> wrote:
I forgot to say that the zip is quite big (70MB) as we have a bunch of autogenerated code and filetree is a bit verbose.. 
I would also like to add that that was on GemStone 3.4 on CentOS 7. 

On GemStone 3.3.3 on OSX it also failed but it failed LATER (trying to signal an out of memory when trying to get the `fileStream contents`). See attachment. 

Thanks,
 


On Thu, Dec 14, 2017 at 9:43 AM, Mariano Martinez Peck <[hidden email]> wrote:
Hi Dale, 

I am trying to load HighchartsSt as a dependency with Metacello. This worked perfectly on Pharo but when I try on GemStone I am getting an error. The way to reproduce it for me is as simple as this:

MetacelloGemStonePlatform new downloadZipArchive: 'https://github.com/ba-st/HighchartsSt/zipball/highchart6-import' to: '/tmp/github-3574-bastHighchartsSthighchart6import.zip'

And this yields me the attached error.

I don't think Github is giving me broken zips. Furthermore, if I do a `wget` then `unzip` it does work correct.  I am trying to check if this fails in older GemStone than 3.4 but I am failing to start my old stones too grrrrr...

Thanks in advance,

--



--



--
--
You received this message because you are subscribed to the Google Groups "Metacello" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.

--
You received this message because you are subscribed to the Google Groups "Metacello" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [hidden email].
For more options, visit https://groups.google.com/d/optout.
12