Travis build failing because of bad CRC

Julien Delplanque-2
Hello,

I’m trying to set up continuous integration for one of my projects, but I get an error with the CRC of the VM (see the log https://travis-ci.org/juliendelplanque/PostgreSQLParser/jobs/304401690).

Is it my fault?

Thanks in advance,

Julien

Re: Travis build failing because of bad CRC

CyrilFerlicot
On 19/11/2017 at 19:54, Julien wrote:
> Hello,
>
> I’m trying to set up continuous integration for one of my project but I
> have an error with the CRC of the VM (see the
> log https://travis-ci.org/juliendelplanque/PostgreSQLParser/jobs/304401690).
>
> Is it my fault?
>

Not your fault. It also happens randomly on Synectique's CI.

> Thanks in advance,
>
> Julien


--
Cyril Ferlicot
https://ferlicot.fr

http://www.synectique.eu
2 rue Jacques Prévert 01,
59650 Villeneuve d'ascq France



Re: Travis build failing because of bad CRC

Julien Delplanque-2
So the fix is just to relaunch the build? :-)

Julien


Re: Travis build failing because of bad CRC

Stephane Ducasse-3
Marcus told me that he wants to address this problem because this is
super annoying.

Stef


Re: Travis build failing because of bad CRC

NorbertHartl
This is more than super annoying. And it is as bad for smalltalkhub as it is for the files.pharo.org server.


Re: Travis build failing because of bad CRC

Marcus Denker-4
We need to exit Inria and do our own infrastructure. The problem is that this will take quite some
time and effort.

        Marcus


Re: Travis build failing because of bad CRC

demarey

> On 23 Nov 2017, at 09:08, Marcus Denker <[hidden email]> wrote:
>
> We need to exit Inria and do our own infrastructure.

Marcus, could you elaborate? Why do we need to exit Inria?
Do you have money to pay for the infrastructure? Inria provides it for free for the Pharo consortium.
Of course, the problem is super annoying, but it is not directly related to Inria. It is related to Renater (the national research and education network in France), the network Inria is connected to. An incident is open there. I still have no feedback, so I cannot say why it is taking so much time to solve the problem.

Re: Travis build failing because of bad CRC

Marcus Denker-4


> On 23 Nov 2017, at 09:33, Christophe Demarey <[hidden email]> wrote:
>
>
>> On 23 Nov 2017, at 09:08, Marcus Denker <[hidden email]> wrote:
>>
>> We need to exit Inria and do our own infrastructure.
>
> Marcus, could you elaborate? Why do we need to exit Inria?

Because it does not work.

> Do you have money to pay the infrastructure? Inria provides it for free for the Pharo consortium.
> Of course, the problem is super annoying but the problem is not directly related to Inria. It is related to Renater (national research and education network in France), the network Inria is connected to. An incident is open there. I still have no feedback, so cannot say why it takes so much time to solve the problem.

That alone is hard to explain to anyone.

        Marcus

Re: Travis build failing because of bad CRC

NorbertHartl
In reply to this post by demarey
Christophe,

> On 23.11.2017 at 09:33, Christophe Demarey <[hidden email]> wrote:
>
>
>> On 23 Nov 2017, at 09:08, Marcus Denker <[hidden email]> wrote:
>>
>> We need to exit Inria and do our own infrastructure.
>
> Marcus, could you elaborate? Why do we need to exit Inria?
> Do you have money to pay the infrastructure? Inria provides it for free for the Pharo consortium.
> Of course, the problem is super annoying but the problem is not directly related to Inria. It is related to Renater (national research and education network in France), the network Inria is connected to. An incident is open there. I still have no feedback, so cannot say why it takes so much time to solve the problem.

your view might be too technical here. Aren't you the one who does Cargo, so dependencies are well known to you? ;)
The point is that it just does not work, and it doesn't matter whose fault this is. The point is that the problem has been there for a long time and nobody was able to fix it. Everyone, including you, is working so hard on the future of Pharo. But this behaviour is like a slap in the face of everyone working for Pharo. You drive away newcomers and people who would like to work with Pharo. It is not possible to reliably download an image or a VM or something from smalltalkhub.
If this is not fully clear, let me tell you what I did yesterday. I resurrected an older project and ported it to Pharo 6. In order to download the image I had to execute the curl command 6 times in a row, and another two times to get a VM. Then I tried loading the project with Metacello and got a debugger because the downloaded archive was corrupt. I needed to open the full debugger to see which file it was. Then I needed to remove the corrupt file from the package-cache and restart. And yesterday I had to do this more than 30 (!!!) times until I could load the full project.
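
The manual hunt for the corrupt file in the package-cache could be automated along these lines. This is only a minimal sketch, not an existing Pharo or Metacello tool: it assumes the cached packages are zip-format archives, the `*.zip` and `*.mcz` patterns and both function names are illustrative, and it relies on the CRC-32 that every zip entry carries.

```python
import zipfile
import zlib
from pathlib import Path


def is_zip_intact(path):
    """Return True if every entry in the archive passes its CRC-32 check."""
    try:
        with zipfile.ZipFile(path) as archive:
            # testzip() reads every entry and returns the first bad name, or None.
            return archive.testzip() is None
    except (zipfile.BadZipFile, zlib.error, EOFError, OSError):
        # Truncated or structurally damaged archives may not even open or decompress.
        return False


def find_corrupt_archives(cache_dir, patterns=("*.zip", "*.mcz")):
    """List archives directly under cache_dir that fail the integrity check.

    The patterns are assumptions; adjust them to what the cache really holds.
    """
    corrupt = []
    for pattern in patterns:
        for path in sorted(Path(cache_dir).glob(pattern)):
            if not is_zip_intact(path):
                corrupt.append(path)
    return corrupt
```

Deleting the paths returned by `find_corrupt_archives("package-cache")` before restarting the load would replace the debugger round trip, although it does nothing about the underlying download problem.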

I think it is completely out of the question to wait until someone is able to fix it. And if the consortium's money is not enough to pay for such a service, then we need to put in some more. I'm all ears!

Norbert


Re: Travis build failing because of bad CRC

demarey
Hi Norbert,

I understand your point of view, which others probably share.
I also agree the situation is very bad: Inria took too much time to investigate the problem, and now Renater is doing the same …
The question is: would it really be better outside Inria? Maybe ... maybe not …
Anyway, there is still an incident open at Renater and GTT. My contact will push today to see if we can have a solution soon.


Re: Travis build failing because of bad CRC

NorbertHartl


> On 23.11.2017 at 11:39, Christophe Demarey <[hidden email]> wrote:
>
> Hi Norbert,
>
> I understand your point of view that others probably share.
> I also agree the situation is very bad: Inria took too much time to investigate the problem and now, renater also …
> The question is: would it be really better outside Inria? Maybe ... maybe not …

Maybe that is the point. It is not a question of whether it works better outside, because it will. The problem we have is so serious that it will be hard to find the same one elsewhere. I can only repeat: it is not the download that fails, which TCP-wise means that exactly what Inria offered is what gets downloaded. So something below the web server is broken, and most probably they have a corrupt storage solution: only if the data is already broken when it is handed to the web server can the broken thing be transported in a sane manner.

We cannot rely on such broken things so we need another solution even if it is temporary.

So no question.

Norbert


Re: Travis build failing because of bad CRC

demarey

> On 23 Nov 2017, at 12:34, Norbert Hartl <[hidden email]> wrote:
>
>
>
>> On 23.11.2017 at 11:39, Christophe Demarey <[hidden email]> wrote:
>>
>> Hi Norbert,
>>
>> I understand your point of view that others probably share.
>> I also agree the situation is very bad: Inria took too much time to investigate the problem and now, renater also …
>> The question is: would it be really better outside Inria? Maybe ... maybe not …
>
> Maybe that is the point. It is not a question if it works better outside because it will. The problem we have is so serious that it will be hard to find elsewhere. I can only repeat: It is not the download that fails which TCP wise means that exactly the thing is downloaded that inria offered. So something below the web server is broken and most probably they have a corrupt storage solution meaning only if you give it broken to the web server the broken thing can be transported in a sane manner.

I’m not confident in the diagnosis.
I have never encountered this problem, and we have had no such feedback from people in the team, so it is really strange. Indeed, if the CRC is wrong, it does not mean slowness but corrupted data … I do not see any reason why we would get CRC errors from outside Inria but not from inside.
If no solution is found today or tomorrow, we will probably look into hosting a solution outside Inria.

Re: Travis build failing because of bad CRC

NorbertHartl


> On 23.11.2017 at 13:18, Christophe Demarey <[hidden email]> wrote:
>
>> On 23 Nov 2017, at 12:34, Norbert Hartl <[hidden email]> wrote:
>>
>>> On 23.11.2017 at 11:39, Christophe Demarey <[hidden email]> wrote:
>>>
>>> Hi Norbert,
>>>
>>> I understand your point of view, which others probably share.
>>> I also agree the situation is very bad: Inria took too much time to investigate the problem, and now Renater is doing the same …
>>> The question is: would it really be better outside Inria? Maybe ... maybe not …
>>
>> Maybe that is the point. It is not a question of whether it works better outside, because it will. The problem we have is so serious that it will be hard to find the same one elsewhere. I can only repeat: it is not the download that fails, which TCP-wise means that exactly what Inria offered is what gets downloaded. So something below the web server is broken, and most probably they have a corrupt storage solution: only if the data is already broken when it is handed to the web server can the broken thing be transported in a sane manner.
>
> I’m not confident in the diagnosis.
> I have never encountered this problem, and we have had no such feedback from people in the team, so it is really strange. Indeed, if the CRC is wrong, it does not mean slowness but corrupted data … I do not see any reason why we would get CRC errors from outside Inria but not from inside.
> If no solution is found today or tomorrow, we will probably look into hosting a solution outside Inria.

If you are not confident about the diagnosis, that's fine. But then it would be cool to come up with a better one rather than just saying "works for me". That is not an explanation, because it assumes that you use the same systems and transport paths externally and internally, and that is most likely not true.
Just try to find an explanation. Here we go:

- If I download a zip file from files.pharo.org, I open a TCP connection from my laptop to files.pharo.org
- TCP has a checksum in the header that is used to check the integrity of each transferred packet
- So I can assume that each packet on that one connection was not tampered with
- If that holds for a single packet, it holds for the whole file
- With all these assumptions, I can state that I get exactly what files.pharo.org is sending me
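
As a side note on the checksum step in this list, the per-packet check is a 16-bit ones' complement sum, which is much weaker than the CRC-32 stored in a zip archive. The sketch below is a simplification and says nothing about what actually happens on files.pharo.org: the real TCP checksum also covers a pseudo-header and the whole segment, and `internet_checksum` is an illustrative helper, not a library function. It only shows that reordering 16-bit words leaves the sum unchanged while CRC-32 notices the change.

```python
import zlib


def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum in the style of RFC 1071."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add each 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


original = bytes.fromhex("12345678")
reordered = bytes.fromhex("56781234")  # same 16-bit words, swapped

# Addition is commutative, so the packet-style sum cannot see the swap.
same_tcp_style_sum = internet_checksum(original) == internet_checksum(reordered)
# CRC-32 depends on position, so the two payloads get different values.
same_crc32 = zlib.crc32(original) == zlib.crc32(reordered)
```

So a passing transport checksum makes corruption on the wire unlikely rather than impossible, which is one reason an end-to-end check such as the zip CRC is still the thing that reports the error.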

So what are the possible ways it can fail?

- Either the layer beneath, which hands the data to the TCP stack, delivers corrupt data. That could be the filesystem, hence my thought
- Or there is no single connection between my laptop and files.pharo.org. In that case the data would flow through a proxy. But even if I have two connections, and the integrity check holds for each of them, then what I said for one connection holds for two or more as well
- Suppose we really refuse to believe that the filesystem fails (a failure there is less likely if it works inside Inria)
- Is there anything left? Yes: if the proxy modifies the data, you can have data integrity on the first connection, then a program that deliberately changes the content and transfers it with integrity again over the second connection. So it can arrive broken at the other end.

If we then look at the file, we might notice that it has the extension .zip, and this is the file type that carries most viruses over the net. So we can even imagine what kind of proxy that might be; let's call it a virus checker. That is what I would look at in order to narrow down the problem.
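
One way to narrow this down further, sketched under the assumption that both an intact and a corrupted copy of the same file are at hand, is to look at where the two copies differ. The helper name `differing_ranges` is illustrative. A single flipped byte, a rewritten block, or a truncated tail would each hint at a different culprit, though the pattern alone proves nothing.

```python
def differing_ranges(good: bytes, bad: bytes):
    """Return (start, end) byte ranges, end exclusive, where the copies differ."""
    ranges = []
    start = None
    common = min(len(good), len(bad))
    for i in range(common):
        if good[i] != bad[i]:
            if start is None:
                start = i  # a new run of differing bytes begins here
        elif start is not None:
            ranges.append((start, i))  # the run ended just before i
            start = None
    if start is not None:
        ranges.append((start, common))
    if len(good) != len(bad):
        # A length mismatch is reported as one trailing range.
        ranges.append((common, max(len(good), len(bad))))
    return ranges
```

For two files on disk, the inputs could simply be `Path(...).read_bytes()` of each copy.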

Norbert


Re: Travis build failing because of bad CRC

Tim Mackinnon
In reply to this post by demarey
I think the days of running this stuff yourselves are disappearing… I think many OSS projects rely on SaaS for this now: outsource it. We already use Travis, and maybe we can leverage that more (but it seems a bit cumbersome), or there are fuller-service solutions like what AWS offers, or GitLab CI, but they will take time to migrate to.

Tim



Re: Travis build failing because of bad CRC

Stephane Ducasse-3
In reply to this post by NorbertHartl
This is super strange because I have quite a bad internet connection
at home, but I never have such a problem, just super slow downloads,
but that is my connection.

Stef


Re: Travis build failing because of bad CRC

CyrilFerlicot

On Thu, 23 Nov 2017 at 23:49, Stephane Ducasse <[hidden email]> wrote:
> This is super strange because I have quite a bad internet connection
> at home, but I never have such a problem, just super slow downloads,
> but that is my connection.

At Synectique we have a connection that can download at 11 MB/s on a server, but we regularly get "bad CRC" errors while using zeroconf. Or one execution of zeroconf is randomly longer than the others. It does not seem to come from the server/user connection.
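
Until the root cause is found, a CI job could at least automate the "relaunch until it works" workaround. The following is a minimal sketch and not part of zeroconf or Travis: `fetch` stands for any hypothetical zero-argument callable that returns the downloaded bytes, and the archive is assumed to be a zip file so that its own CRC-32 values can be checked in memory.

```python
import io
import zipfile
import zlib


def zip_bytes_intact(payload: bytes) -> bool:
    """Return True if payload is a zip archive whose entries all pass their CRC."""
    try:
        with zipfile.ZipFile(io.BytesIO(payload)) as archive:
            return archive.testzip() is None
    except (zipfile.BadZipFile, zlib.error, EOFError):
        return False


def fetch_until_intact(fetch, max_attempts=5):
    """Call fetch() until it returns an intact zip archive.

    Returns (payload, attempts_used); raises RuntimeError if every attempt fails.
    """
    for attempt in range(1, max_attempts + 1):
        payload = fetch()
        if zip_bytes_intact(payload):
            return payload, attempt
    raise RuntimeError(f"no intact archive after {max_attempts} attempts")
```

With the standard library, `fetch` could be something like `lambda: urllib.request.urlopen(url).read()`, where `url` is whatever archive the build needs. Logging `attempts_used` over many builds would also give a rough measure of how often the corruption occurs.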





--
Cyril Ferlicot
https://ferlicot.fr

http://www.synectique.eu
2 rue Jacques Prévert 01,
59650 Villeneuve d'ascq France

Re: Travis build failing because of bad CRC

Stephane Ducasse-3
In reply to this post by NorbertHartl
I like your analysis and I share your frustration.
And this is such bad press for Inria.

Stef


Re: Travis build failing because of bad CRC

Ben Coman
In reply to this post by demarey
On 23 November 2017 at 20:18, Christophe Demarey <[hidden email]> wrote:
> I’m not confident in the diagnosis.
> I have never encountered this problem

The problem seems geo-specific, so you may never personally observe it. Of course, that can make it near-impossible for you to troubleshoot.
So consider the engineering principle: "If you can't solve the problem, change the problem."


cheers -ben

Re: Travis build failing because of bad CRC

demarey
In reply to this post by NorbertHartl
Hi Norbert,

> On 23 Nov 2017, at 14:49, Norbert Hartl <[hidden email]> wrote:
>
> If you are not confident about the diagnosis, that's fine. But then it would be cool to come up with a better one

Sorry, but I do not have a better diagnosis than yours today.

> rather than just saying "works for me".

That sentence was only there to add the fact that it is not broken for all people or locations. It adds information to help debugging.
It does not mean that, because it works for us, we do not investigate or try to find a solution.
Yesterday, I set up a new file server at Inria to replace files.pharo.org (at least to test whether it is better), but I’m still waiting for the network unit to open port 80 to the world …
Today, I will probably set up a server outside Inria if I get the agreement to do it.

> That is not an explanation, because it assumes that you use the same systems and transport paths externally and internally, and that is most likely not true.
> Just try to find an explanation. Here we go:
>
> - If I download a zip file from files.pharo.org, I open a TCP connection from my laptop to files.pharo.org
> - TCP has a checksum in the header that is used to check the integrity of each transferred packet
> - So I can assume that each packet on that one connection was not tampered with
> - If that holds for a single packet, it holds for the whole file
> - With all these assumptions, I can state that I get exactly what files.pharo.org is sending me

Agreed.

> So what are the possible ways it can fail?
>
> - Either the layer beneath, which hands the data to the TCP stack, delivers corrupt data. That could be the filesystem, hence my thought
> - Or there is no single connection between my laptop and files.pharo.org. In that case the data would flow through a proxy. But even if I have two connections, and the integrity check holds for each of them, then what I said for one connection holds for two or more as well
> - Suppose we really refuse to believe that the filesystem fails (a failure there is less likely if it works inside Inria)
> - Is there anything left? Yes: if the proxy modifies the data, you can have data integrity on the first connection, then a program that deliberately changes the content and transfers it with integrity again over the second connection. So it can arrive broken at the other end.
>
> If we then look at the file, we might notice that it has the extension .zip, and this is the file type that carries most viruses over the net. So we can even imagine what kind of proxy that might be; let's call it a virus checker. That is what I would look at in order to narrow down the problem.

Thanks for trying to investigate.
At Inria, we do indeed have some boxes dedicated to filtering network traffic and eliminating spam and viruses. I’m not sure they are used for outgoing traffic, but they are a potential source of the problem. I will check with the network unit.
By the way, is there a way to reproduce the bad CRC reliably?
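
Since the failures reported in this thread look random, a deterministic reproduction may not exist, but the failure rate from a given location can at least be measured. The sketch below is only an assumption about how one might do that: it downloads the same resource repeatedly through a caller-supplied `fetch` callable (hypothetical, any zero-argument function returning bytes) and tallies the SHA-256 digests. It treats the most frequent digest as the intact file, which only holds if corruption is the exception rather than the rule.

```python
import hashlib
from collections import Counter


def tally_digests(fetch, attempts=20):
    """Download the same resource repeatedly and count distinct SHA-256 digests."""
    digests = Counter()
    for _ in range(attempts):
        digests[hashlib.sha256(fetch()).hexdigest()] += 1
    return digests


def corruption_rate(digests):
    """Fraction of downloads whose digest differs from the most common one."""
    total = sum(digests.values())
    if total == 0:
        return 0.0
    most_common_count = digests.most_common(1)[0][1]
    return (total - most_common_count) / total
```

Running the same tally from inside and outside Inria, or from a Travis job, would show whether the problem really depends on the location of the client.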

Christophe

Re: Travis build failing because of bad CRC

philippeback
In reply to this post by Ben Coman
Same issues for me.

I actually use an older thing that I know works instead of using zeroconf etc. anymore.

Others may just walk away from Pharo silently.

Look, in an age of Docker and successful multi-gigabyte downloads, a couple of megabytes shouldn't be that hard.

S3 or whatever works for millions of podcasts, BinTray also works nicely, etc.

Shouldn't some consortium money be channeled in there?

Phil

