Hello, I've run into some odd behavior and wanted to check whether I might be missing something: I have downloaded a copy of the English Wikipedia as an XML file and am hoping to (SAX) parse it. However, I can't even seem to get Pharo to recognize that the file exists. The version number is #40283. Thanks,
Hi, If I understand correctly, the failure occurs while navigating in the "Items" presentation. I cannot reproduce this problem because I do not have enough disk space for such a large file :). But, could you do the following and let me know what the outcome is: 'path/to/your/large/file.xml' asFileReference humanReadableSize ? Cheers, Doru On Tue, Oct 14, 2014 at 2:27 AM, Evan Donahue <[hidden email]> wrote:
"Every thing has its own flow"
Hi, thanks for the reply. The response is the same: "MessageNotUnderstood: False>>humanReadableSIByteSize." Evan On Tue, Oct 14, 2014 at 12:24 AM, Tudor Girba <[hidden email]> wrote:
2014-10-14 6:38 GMT+02:00 Evan Donahue <[hidden email]>:
Which OS? Can you check with other programs whether this file is readable at all? Nicolai
The OS is Arch Linux. I can read the file with less.

Primitives lookupDirectory: encodedPath filename: encodedBasename

On Tue, Oct 14, 2014 at 8:17 AM, Nicolai Hess <[hidden email]> wrote:
There is a bug report on Mantis for Squeak's unix VM. I think this applies to Pharo too, although I don't know if this

2014-10-14 17:43 GMT+02:00 Evan Donahue <[hidden email]>:
Hi Nicolai,
On Tue, Oct 14, 2014 at 11:26 AM, Nicolai Hess <[hidden email]> wrote:
Yes, one must compile with -D_FILE_OFFSET_BITS=64. The Cog VMs are also built with -D_GNU_SOURCE. Here's a line from a Squeak file list on the current Cog VM: (2014.10.11 07:00:46 7,115,143,880) Formula1.2014.Round16.Russia.Qualifying.BBCOneHD.1080i.H264.English-wserhkzt.ts. No 32-bit limit here.
best, Eliot
In reply to this post by Nicolai Hess
Rebuilt the Pharo VM with the -D_FILE_OFFSET_BITS=64 option: reading and listing directories with (very) large files works now. 2014-10-14 20:26 GMT+02:00 Nicolai Hess <[hidden email]>:
64 bits! That makes sense. I don't generally work with large files either. I have been getting my VM from get.pharo.org, so I will need to get the source to build the VM in the first place. A quick survey over the last few hours has revealed a multitude of VMs, projects, platforms, repositories, and versions that I, in my Pharo ignorance, cannot differentiate. Could someone please point me to the source I should be using to build the VM for the pharo40 image I have been pulling off get.pharo.org? Thank you, Evan On Tue, Oct 14, 2014 at 3:02 PM, Nicolai Hess <[hidden email]> wrote:
Easiest way: download the generated source from http://files.pharo.org/vm/src/vm-unix-sources/blessed/ 2014-10-14 22:35 GMT+02:00 Evan Donahue <[hidden email]>:
In reply to this post by Evan Donahue
2014-10-13 21:27 GMT-03:00 Evan Donahue <[hidden email]>:
Just for curiosity's sake, is there a reason why you don't query DBpedia via SPARQL? Cheers, Hernán
Certainly, thanks for the curiosity. I am processing natural language statistics over the entire wiki corpus, not querying for specific entries. The information and entries aren't important so much as the raw quantity of words. A simple stream through a file on disk is all I need. Evan On Wed, Oct 15, 2014 at 12:49 AM, Hernán Morales Durand <[hidden email]> wrote:
On Oct 15, 2014, at 1:02 AM, Evan Donahue <[hidden email]> wrote:
> I am processing natural language statistics over the entire wiki corpus, not querying for specific entries. The information and entries aren't important so much as the raw quantity of words. A simple stream through a file on disk is all I need.

Maybe you can work around the problem by writing a shell script that writes the file over a pipe or socket. Then, on the Pharo side, connect to the pipe/socket and read the data stream.
In reply to this post by Evan Donahue
Cool project! Let me know how you go. Cheers, Hernán 2014-10-15 2:02 GMT-03:00 Evan Donahue <[hidden email]>: