Hi Stef,

On 12 November 2017 at 14:47, Stephane Ducasse <[hidden email]> wrote:
> exampleNavigation
>     | chrome page logger |
>     logger := InMemoryLogger new.
>     logger start.
>     chrome := GoogleChrome new
>         debugOn;
>         debugSession;
>         open;
>         yourself.
>     page := chrome tabPages first.
>     page enablePage.
>     page enableDOM.
>     page navigateTo: 'http://pharo.org'.
>     page getDocument.
>     page getMissingChildren.
>     page updateTitle.
>     logger stop.
>     ^{ chrome. page. logger. }
>
> but in fact I realised that I would like a simple doc :)
>
>
> On Sun, Nov 12, 2017 at 2:44 PM, Stephane Ducasse <[hidden email]> wrote:
>> Hi alistair
>>
>> this is cool.
>> Do you have one little example so that we can see how we can use it?
>>
>> Stef

Fair enough :-)

I'll try and extend the readme to include some basic documentation.

Cheers,
Alistair

>> On Sat, Nov 11, 2017 at 4:38 PM, Alistair Grant <[hidden email]> wrote:
>>> On 9 November 2017 at 00:00, Kjell Godo <[hidden email]> wrote:
>>>> i like to collect some newspaper comics from an online newspaper
>>>> but it takes really long to do it by hand
>>>> i tried Soup but i didn’t get anywhere
>>>> the pictures were hidden behind a script or something
>>>> is there anything to do about that?
>>>
>>> Most of the web pages I want to scrape use javascript to construct the
>>> DOM, which makes Soup, XMLHTMLParser, etc. useless.
>>>
>>> I've extended Torsten's Pharo-Chrome library and use that to navigate
>>> the DOM in a way similar to Soup:
>>>
>>> https://github.com/akgrant43/Pharo-Chrome
>>>
>>> This gets around the issue with javascript since it waits for the
>>> browser to load the page, run the javascript and construct the DOM.
>>>
>>> HTH,
>>> Alistair
>>>
>>>> i don’t want to collect them all
>>>> i have the XPath .pdf but i haven’t read it yet
>>>>
>>>> these browsers seem to gobble up memory
>>>> and while open they just keep getting bigger till the OS session crash
>>>> might there be a browser that is more minimal?
>>>>
>>>> Vivaldi seems better at not bloating up RAM
|
Tx and one day we can turn it into another little booklet :)
Stef
|
In reply to this post by alistairgrant
Hi Sean,

Thanks for your feedback! (responses below)

On 12 November 2017 at 18:11, Sean P. DeNigris <[hidden email]> wrote:
> Alistair Grant wrote
>> https://github.com/akgrant43/Pharo-Chrome
>
> Wow, that was a wild ride!

Sorry about that.

> Lessons learned along the way:
> 1. On a Mac, to use the snazzy `chrome` terminal command referenced all over
> the place in the docs, you must first
> `alias chrome="/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome"`

I'm an Ubuntu Linux user, however if you look at OSXChromePlatform
class>>defaultExecutableLocation you can see that is where it should
be looking for the exe, so the alias shouldn't really be necessary.
Torsten wrote this, so maybe has more insight.

> 2. Chrome must be started with certain flags: `chrome
> --remote-debugging-port=9222 --disable-gpu` (not sure if the last flag is
> needed, but `#get:` seemed to hang before using; reference
> https://developers.google.com/web/updates/2017/04/headless-chrome)

I've been using this without headless mode. I'll add a headless flag
that also disables the gpu.

> 3. Beacon has renamed InMemoryLogger to MemoryLogger
> 4. I guess Beacon has renamed `#log` to `#emit`

Sorry about that. I didn't realise that the Pharo-Chrome baseline is
loading Beacon stable while my install script upgrades it to
#development. #development is more recent, so I'll update the
baseline.

> 5. I had to comment out `chromeProcess sigterm.` because `chromeProcess` was
> nil and also #sigterm seemed not to be defined anywhere in the image. I'm
> not sure what the issue is there.

chromeProcess is set in GoogleChrome>>openURL:. Can you give me a
small example that demonstrates the problem?

#sigterm is implemented by OSSUnixSubprocess, which is what I
ultimately use to launch the Chrome process on Ubuntu.

But... this will be broken on Mac at the moment because the current
method of launching chrome doesn't keep track of the process, so
doesn't support #sigterm. Do you know if OSSUnixSubprocess works on
Mac? If it does, I can update the code (but not test it :-().

> Pull request issued for #3 & #4.

Once I update the baseline this shouldn't be required.

> Also, I'm not sure what platforms you
> support, but you may want to tag the example methods with <gtExample> or
> similar so that they are runnable from the browser and open an inspector if
> there is an interesting return value.

Good idea, I'll do this.

I'm also making a few other changes:

1. Add an #extractTables method that searches through the page and
returns an array of rows for each table it finds in the page
(something that can easily be loaded in to DataFrame using #fromRows:,
but I don't want to make Pharo-Chrome dependent on DataFrame at the
moment). Most of the time I use Pharo-Chrome it is extracting data
from tables.

2. I don't know of any reliable way to tell when a page has loaded
since there can always be javascript that periodically updates the
page. At the moment it waits until the page hasn't changed for a
configurable amount of time. I'm planning to add a check for specific
content to determine if the page is considered loaded.

3. Add some documentation to the readme :-)

> -----
> Cheers,
> Sean

I'll let you know when I have a new version available (hopefully in
the next few days).

Thanks again,
Alistair
|
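
The "wait until the page hasn't changed for a configurable amount of time" idea can be sketched in plain Pharo. This is only an illustration under stated assumptions, not the code used in Pharo-Chrome: `waitUntilStable` and its three arguments (a snapshot block, a quiet period and a timeout) are hypothetical names, and the snapshot block stands in for whatever is used to observe the page.

```smalltalk
"Hypothetical helper, not part of Pharo-Chrome: poll snapshotBlock until its
 value has stayed the same for quietPeriod, or give up once timeout has elapsed.
 Answers true if the value settled, false if the timeout was reached first."
| waitUntilStable counter |
waitUntilStable := [ :snapshotBlock :quietPeriod :timeout |
	| deadline last lastChange current |
	deadline := DateAndTime now + timeout.
	last := snapshotBlock value.
	lastChange := DateAndTime now.
	[ DateAndTime now - lastChange < quietPeriod
		and: [ DateAndTime now < deadline ] ] whileTrue: [
			(Delay forMilliseconds: 100) wait.
			current := snapshotBlock value.
			current = last ifFalse: [
				"The observed value changed, so restart the quiet period."
				last := current.
				lastChange := DateAndTime now ] ].
	DateAndTime now - lastChange >= quietPeriod ].

"Stand-in for a page: the value stops changing after a few polls."
counter := 0.
waitUntilStable
	value: [ counter := (counter + 1) min: 5 ]
	value: 500 milliSeconds
	value: 10 seconds.
```

A check for specific content, as planned in point 2, would fit the same shape: the loop would stop as soon as a (hypothetical) condition block answers true instead of waiting for the quiet period.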
I've committed some fixes to the development branch:

1. MacOS hopefully works now (I don't have access to the platform, so
can't test it).
2. The development version of Beacon is loaded (which is required for
the InMemoryLogger).
3. The README is a tiny bit better.
4. Added #extractTables.

As an example of how historical stock market data can be extracted,
the following retrieves data for the Australian S&P200 index from
yahoo:

| rootNode tables historicalData dataFrame |

rootNode := GoogleChrome get: 'https://finance.yahoo.com/quote/%5EAXJO/history?p=%5EAXJO'.
tables := rootNode extractTables.
historicalData := (tables sorted: #size ascending) last.
dataFrame := DataFrame fromRows: (historicalData select: [ :each | each size = 7 ]).
dataFrame asStringTable.

"
     |  1             2         3         4         5         6            7
-----+-----------------------------------------------------------------------------
1    |  Date          Open      High      Low       Close*    Adj Close**  Volume
2    |  Nov 14, 2017  6,021.80  6,021.80  5,957.10  5,966.00  5,966.00     -
3    |  Nov 13, 2017  6,029.40  6,029.40  6,010.70  6,021.80  6,021.80     -
4    |  Nov 10, 2017  6,049.40  6,049.40  6,020.70  6,029.40  6,029.40     -
etc.
"

To load the development version on MacOS or Linux in a 32 bit image:

"Assuming you don't have OSProcess loaded:"
Metacello new
    configuration: 'OSSubprocess';
    repository: 'github://marianopeck/OSSubprocess:master/repository';
    version: #stable;
    load.

Metacello new
    baseline: 'Chrome';
    repository: 'github://akgrant43/Pharo-Chrome:development/repository';
    load.

Cheers,
Alistair
|
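
The table-selection step in the stock market example can be tried without Chrome or DataFrame. The sketch below assumes, as described in the thread, that #extractTables answers one array of rows per table; the `tables` literal is a stand-in, with the header and first data row copied from the output above and the other rows made up for illustration. A plain sort block is used in place of `#size ascending`.

```smalltalk
| tables largest rows |
"Stand-in for the result of #extractTables: one Array of rows per table.
 The 'Menu' table and the footnote row are invented for this sketch."
tables := {
	{ #('Menu' 'Home') }.
	{ #('Date' 'Open' 'High' 'Low' 'Close*' 'Adj Close**' 'Volume').
	  #('Nov 14, 2017' '6,021.80' '6,021.80' '5,957.10' '5,966.00' '5,966.00' '-').
	  #('(made-up footnote row spanning the table)') } }.

"Take the table with the most rows, as the example does with 'sorted: ... last'."
largest := (tables sorted: [ :a :b | a size <= b size ]) last.

"Keep only rows with the expected 7 columns, dropping footnotes and the like."
rows := largest select: [ :each | each size = 7 ].
rows size. "2: the header row and one data row"
```

Filtering on the column count is what keeps irregular rows out of `DataFrame fromRows:` in the original example.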
Alistair Grant wrote
> I've committed some fixes to the development branch:

Thanks!

I tried your example, but apparently the OSXProcess class, which is
referenced in openChromeWith:, is missing. Also, no class in the image seems
to define #createProcess:, which is sent to OSXProcess there.

-----
Cheers,
Sean
--
Sent from: http://forum.world.st/Pharo-Smalltalk-Users-f1310670.html
|
Hi Sean,

On 14 November 2017 at 19:06, Sean P. DeNigris <[hidden email]> wrote:
> Alistair Grant wrote
>> I've committed some fixes to the development branch:
>
> Thanks!
>
> I tried your example, but apparently the OSXProcess class, which is
> referenced in openChromeWith: is missing. Also, no class in the image seems
> to define #createProcess:, which is sent to OSXProcess there

This looks like you are using an old (cached?) version. Maybe try
"Pull incoming commits" from Iceberg?

You should have (minus the broken formatting from pasting):

OSXChromePlatform>>openChromeWith: arguments

    | executableLocation process |
    executableLocation := self defaultExecutableLocation copyReplaceAll: ' ' with: '\ '.
    process := AKGOSProcess command: executableLocation arguments: arguments.
    process run.
    ^process

HTH,
Alistair
|
On 14 November 2017 at 19:13, Alistair Grant <[hidden email]> wrote:
> This looks like you are using an old (cached?) version. Maybe try
> "Pull incoming commits" from Iceberg?

P.S. Don't forget this is on the development branch.

Cheers,
Alistair
|
Hi Alistair,

I have tried to run the examples, but it seems that the installation doesn't
include all the needed packages. At the beginning I installed OSUnix and then
OSLinuxUbuntu. Neither of them seems to include "AKGOSProcess", so the
"GoogleChrome get: 'http://pharo.org'" example raises:
"#command:arguments: was sent to nil". Which package provides the proper
installation for a 64 bit Manjaro Linux, including the dependencies?

Thanks,

Offray
|
OK, the development branch solves this, as shown in
http://ws.stfx.eu/O6J4CJ1FZF89. Now I'm getting an unresponsive image
until I close Chrome, but I think that was discussed in the thread. I'll
check.

Cheers,

Offray
|
In reply to this post by alistairgrant
Hi Alistair,

The example is not working for me. When I run it, a Chrome session is
opened but nothing happens there, except that my image is frozen until I
close Chrome, and then I get this message: "ConnectionTimedOut: Cannot
connect to 127.0.0.1:9222". What is the expected behavior? PharoChrome
expects the user to have a Google account or be logged in by default to
work (that would be a shame for those of us who don't have a Google account
and still value our privacy).

Thanks,

Offray
|
The last was a question :-P Is PharoChrome expecting to be logged in to
some Google account to work?

Cheers,

Offray
|
In reply to this post by alistairgrant
Alistair Grant wrote
> This looks like you are using an old (cached?) version.

Ugh, yes. I just deleted the local clone and let Iceberg reclone.

Now when I tried:
`GoogleChrome get: 'https://finance.yahoo.com/quote/%5EAXJO/history?p=%5EAXJO'`
I got:
Error: Error: posix_spawn(), code: 2, description: No such file or directory
Even though pasting the command into Terminal successfully launched Chrome.

BTW I had to insert a leading / into the executable location.

-----
Cheers,
Sean
|
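
One thing that may be worth checking for the posix_spawn error, offered as an assumption rather than a confirmed diagnosis: posix_spawn() is given the executable path directly, with no shell in between, so the `\ ` escapes added by `copyReplaceAll: ' ' with: '\ '` in openChromeWith: are not interpreted and the backslashes become part of the file name being looked up. The same string pasted into Terminal works because the shell removes the backslashes first. The sketch below only compares the two strings; the path is the one quoted earlier in the thread.

```smalltalk
| path escaped |
"The executable location as a plain path, and as escaped for a shell."
path := '/Applications/Google Chrome.app/Contents/MacOS/Google Chrome'.
escaped := path copyReplaceAll: ' ' with: '\ '.

escaped. "'/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome'"

"On a Mac with Chrome installed, comparing these two answers shows which
 form names a file that actually exists."
path asFileReference exists.
escaped asFileReference exists.
```

If only the unescaped form exists on disk, passing the unescaped path to the process launcher would be the thing to try.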
If this is a problem with OSSubprocess I am happy to help debug it, but please share with me the exact steps to reproduce it and which code to look at, as well as which OS and which Pharo. And it should be 32 bits (OSSubprocess doesn't work on 64 bits yet).

Thanks,
|
In reply to this post by Offray Vladimir Luna Cárdenas-2
Hi Offray,
On 15 November 2017 at 00:18, Offray Vladimir Luna Cárdenas <[hidden email]> wrote:
> Hi Alistair,
>
> The example is not working for me. When I run it, a chrome session is
> open but nothing happens there, except that my image gets frozen until I
> close chrome and then I get this message: "ConnectionTimedOut: Cannot
> connect to 127.0.0.1:9222". What is the expected behavior?

I'm not sure why this is happening. Chrome only allows one instance per profile to be running; however, the example should be creating its own profile (which is what happens when GoogleChrome>>debugSession is sent).

Can you check that the profile directory is being created:

    /tmp/pharo/GoogleChrome/debugSession/

Also, you should have several processes running, similar to:

    alistair 11001 6953 3 07:34 pts/19 00:00:57 /opt/google/chrome/chrome --user-data-dir=/tmp/pharo/GoogleChrome/debugSession --remote-debugging-port=9222
    alistair 11005 11001 0 07:34 pts/19 00:00:00 /opt/google/chrome/chrome --type=zygote --enable-crash-reporter=9472c7b5-b817-49a9-a2df-266ef87a1707,unknown --user-data-dir=/tmp/pharo/GoogleChrome/debugSession
    alistair 11009 11005 0 07:34 pts/19 00:00:00 /opt/google/chrome/chrome --type=zygote --enable-crash-reporter=9472c7b5-b817-49a9-a2df-266ef87a1707,unknown --user-data-dir=/tmp/pharo/GoogleChrome/debugSession
    alistair 11193 11009 6 07:35 pts/19 00:01:51 /opt/google/chrome/chrome --type=renderer --field-trial-handle=13786453131923986905,2801831905294320914,131072 --service-pipe-token=4E2DA31A2AA7D6D8585A99928CABF01B --lang=en-GB --enable-crash-reporter=9472c7b5-b817-49a9-a2df-266ef87a1707,unknown --user-data-dir=/tmp/pharo/GoogleChrome/debugSession --enable-offline-auto-reload --enable-offline-auto-reload-visible-only --enable-pinch --num-raster-threads=2 --enable-main-frame-before-activation --content-image-texture-target=(lots of numbers removed)

You can see that the first process has the separate profile (--user-data-dir) and remote debugging enabled. The last process listed above is the one rendering the page (I ran GoogleChrome class>>exampleNavigation to get this).

Maybe as a last resort you could try ensuring that no other instances of chrome are running before you try the example.

> PharoChrome
> expects the user to have a Google account or be logged in by default to
> work (that would be a shame for those of us that don't have a Google account
> and still value our privacy).

No, by default it won't be logged in (since it is creating a separate profile).

Thanks,
Alistair

> Thanks,
>
> Offray
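The two checks requested above (does the profile directory exist, and is a Chrome process running with remote debugging enabled) can be sketched as a small shell script. This is only a rough sketch, not part of Pharo-Chrome: the function names check_profile_dir and list_debug_chrome are hypothetical, and the directory and port are simply the values quoted in the message above.

```shell
#!/bin/sh
# Values taken from the thread; override via the environment if yours differ.
PROFILE_DIR="${PROFILE_DIR:-/tmp/pharo/GoogleChrome/debugSession}"
DEBUG_PORT="${DEBUG_PORT:-9222}"

# Hypothetical helper: report whether the debug profile directory exists.
check_profile_dir() {
    if [ -d "$1" ]; then
        echo "profile directory exists: $1"
    else
        echo "profile directory missing: $1"
        return 1
    fi
}

# Hypothetical helper: list processes started with the remote debugging flag.
# The "--" stops grep from treating the pattern as an option.
list_debug_chrome() {
    ps -ef | grep -- "--remote-debugging-port=$1" | grep -v grep
}

check_profile_dir "$PROFILE_DIR" || true
list_debug_chrome "$DEBUG_PORT" \
    || echo "no Chrome process with --remote-debugging-port=$DEBUG_PORT"
```

If the directory is present but no process carries the --remote-debugging-port flag, Chrome was probably started from an existing instance rather than from the separate profile.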
Hi Sean,
Sorry (and to Offray) for the trouble, but thanks for persevering.

On 15 November 2017 at 01:47, Sean P. DeNigris <[hidden email]> wrote:
> Alistair Grant wrote
>> This looks like you are using an old (cached?) version.
>
> Ugh, yes. I just deleted the local clone and let Iceberg reclone.
>
> Now when I tried:
> `GoogleChrome get:
> 'https://finance.yahoo.com/quote/%5EAXJO/history?p=%5EAXJO'`
> I got:
> Error: Error: posix_spawn(), code: 2, description: No such file or
> directory
> Even though pasting the command into Terminal successfully launched Chrome.
>
> BTW I had to insert a leading / into the executable location.

Would you mind setting a breakpoint in AKGOSProcess>>command:arguments:, printing the command and arguments and making sure that the --user-data-dir exists? (I'm not familiar with MacOS and am wondering if maybe there is some sandboxing causing trouble).

Also, as Mariano requested, can you confirm that it is MacOS, which version of Pharo, and a 32 bit VM?

Thanks,
Alistair
Alistair Grant wrote
> Sorry (and to Offray) for the trouble, but thanks for persevering.

Not at all! Thanks for updating the library :)

Alistair Grant wrote
> Would you mind setting a breakpoint in
> AKGOSProcess>>command:arguments:, printing the command and arguments
> and making sure that the --user-data-dir exists?

Sure:

    command = '/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome' "(after I added the leading slash)".
    args = #('--user-data-dir=/tmp/pharo/GoogleChrome/debugSession' '--remote-debugging-port=9222')

Alistair Grant wrote
> and making sure that the --user-data-dir exists?

It does.

Also, as a sanity check, the following works:

    OSSUnixSubprocess new
        command: 'open';
        arguments: { 'http://www.pharo.org' };
        run

As well as from the Terminal:

    $ /Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --user-data-dir=/tmp/pharo/GoogleChrome/debugSession --remote-debugging-port=9222

Alistair Grant wrote
> Also, as Mariano requested, can you confirm that it is MacOS, which
> version of Pharo, and a 32 bit VM?

I think I'm on 32 bit. I created a fresh 6.1 from Launcher. How can one tell for sure if one is in a 32 vs 64 bit image?
Latest update: #60520

Operating System/Hardware
-------------------------
Mac OS 1013.1 intel

Virtual Machine
---------------
/Users/sean/Documents/Pharo/vms/61-x86/Pharo.app/Contents/MacOS/Pharo
CoInterpreter VMMaker.oscog-eem.2254 uuid: 4f2c2cce-f4a2-469a-93f1-97ed941df0ad Jul 20 2017
StackToRegisterMappingCogit VMMaker.oscog-eem.2252 uuid: 2f3e9b0e-ecd3-4adf-b092-cce2e2587a5c Jul 20 2017
VM: 201707201942 https://github.com/OpenSmalltalk/opensmalltalk-vm.git $ Date: Thu Jul 20 12:42:21 2017 -0700 $ Plugins: 201707201942 https://github.com/OpenSmalltalk/opensmalltalk-vm.git $

Mac OS X built on Jul 20 2017 21:45:23 UTC Compiler: 4.2.1 Compatible Apple LLVM 6.1.0 (clang-602.0.53)
VMMaker versionString VM: 201707201942 https://github.com/OpenSmalltalk/opensmalltalk-vm.git $ Date: Thu Jul 20 12:42:21 2017 -0700 $ Plugins: 201707201942 https://github.com/OpenSmalltalk/opensmalltalk-vm.git $
CoInterpreter VMMaker.oscog-eem.2254 uuid: 4f2c2cce-f4a2-469a-93f1-97ed941df0ad Jul 20 2017
StackToRegisterMappingCogit VMMaker.oscog-eem.2252 uuid: 2f3e9b0e-ecd3-4adf-b092-cce2e2587a5c Jul 20 2017

-----
Cheers,
Sean
--
Sent from: http://forum.world.st/Pharo-Smalltalk-Users-f1310670.html
Hi Sean,
On 15 November 2017 at 10:23, Sean P. DeNigris <[hidden email]> wrote:
> Alistair Grant wrote
>> Sorry (and to Offray) for the trouble, but thanks for persevering.
>
> Not at all! Thanks for updating the library :)
>
>
> Alistair Grant wrote
>> Would you mind setting a breakpoint in
>> AKGOSProcess>>command:arguments:, printing the command and arguments
>> and making sure that the --user-data-dir exists?
>
> Sure:
> command = '/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome'
> "(after I added the leading slash)".

I've added the slash, so hopefully you won't have to do this again.

> args = #('--user-data-dir=/tmp/pharo/GoogleChrome/debugSession'
> '--remote-debugging-port=9222')
>
>
> Alistair Grant wrote
>> and making sure that the --user-data-dir exists?
>
> It does.
>
> Also, as a sanity check, the following works:
> OSSUnixSubprocess new
> command: 'open';
> arguments: { 'http://www.pharo.org' };
> run
> As well as from the Terminal:
> $ /Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome
> --user-data-dir=/tmp/pharo/GoogleChrome/debugSession
> --remote-debugging-port=9222
>
>
> Alistair Grant wrote
>> Also, as Mariano requested, can you confirm that it is MacOS, which
>> version of Pharo, and a 32 bit VM?
>
> I think I'm on 32 bit. I created a fresh 6.1 from Launcher. How can one tell
> for sure if one is in a 32 vs 64 bit image?

    OSPlatform current isUnix32

But if the OSSUnixSubprocess command above is working it must be 32 bit.

I'm struggling to figure this one out. Sorry for making you do all the work, but the only things I can think of at the moment are:

1. If chrome is your default browser, can you try replacing the explicit command with "open", since it seems to work above, i.e. in OSXChromePlatform class>>openChromeWith: just set executableLocation := 'open'.

2. Try just opening the browser without any optional arguments, i.e.

    OSSUnixSubprocess new
        command: '/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome';
        arguments: { 'http://www.pharo.org' };
        run

> Latest update: #60520
>
> Operating System/Hardware
> -------------------------
> Mac OS 1013.1 intel
>
> Virtual Machine
> ---------------
> /Users/sean/Documents/Pharo/vms/61-x86/Pharo.app/Contents/MacOS/Pharo
> CoInterpreter VMMaker.oscog-eem.2254 uuid:
> 4f2c2cce-f4a2-469a-93f1-97ed941df0ad Jul 20 2017
> StackToRegisterMappingCogit VMMaker.oscog-eem.2252 uuid:
> 2f3e9b0e-ecd3-4adf-b092-cce2e2587a5c Jul 20 2017
> VM: 201707201942 https://github.com/OpenSmalltalk/opensmalltalk-vm.git $
> Date: Thu Jul 20 12:42:21 2017 -0700 $ Plugins: 201707201942
> https://github.com/OpenSmalltalk/opensmalltalk-vm.git $
>
> Mac OS X built on Jul 20 2017 21:45:23 UTC Compiler: 4.2.1 Compatible Apple
> LLVM 6.1.0 (clang-602.0.53)
> VMMaker versionString VM: 201707201942
> https://github.com/OpenSmalltalk/opensmalltalk-vm.git $ Date: Thu Jul 20
> 12:42:21 2017 -0700 $ Plugins: 201707201942
> https://github.com/OpenSmalltalk/opensmalltalk-vm.git $
> CoInterpreter VMMaker.oscog-eem.2254 uuid:
> 4f2c2cce-f4a2-469a-93f1-97ed941df0ad Jul 20 2017
> StackToRegisterMappingCogit VMMaker.oscog-eem.2252 uuid:
> 2f3e9b0e-ecd3-4adf-b092-cce2e2587a5c Jul 20 2017
>
>
>
> -----
> Cheers,
> Sean
> --
> Sent from: http://forum.world.st/Pharo-Smalltalk-Users-f1310670.html
>
Alistair Grant wrote
> Sorry for making you do all the work

Not at all; happy to help. It takes a village! BTW I tracked it down to the spaces in the command path. IIRC from my OSP hacking days, it probably has something to do with the path not being run through the shell to interpret the $\s.

-----
Cheers,
Sean
Hi Sean,
On 15 November 2017 at 23:04, Sean P. DeNigris <[hidden email]> wrote:
> Alistair Grant wrote
>> Sorry for making you do all the work
>
> Not at all; happy to help. It takes a village! BTW I tracked it down to the
> spaces in the command path. IIRC from my OSP hacking days, it probably has
> something to do with the path not being run through the shell to interpret
> the $\s.

I'm glad (and relieved :-)) to hear that it is working.

Would you mind sending the modified command path that you're using so I can update the code? (I guess that it is just removing the backslashes, but just in case...).

Thanks!
Alistair
Alistair Grant wrote
> I'm glad (and relieved :-)) to hear that it is working.
>
> Would you mind sending the modified command path that you're using so
> I can update the code? (I guess that it is just removing the
> backslashes, but just in case...).

That is correct…

    OSSUnixSubprocess new
        command: '/Applications/Google Chrome.app/Contents/MacOS/Google Chrome';
        arguments: #('--user-data-dir=/tmp/pharo/GoogleChrome/debugSession' '--remote-debugging-port=9222');
        run.

-----
Cheers,
Sean
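The root cause found above is that a backslash before a space is shell quoting, not part of the file name. A shell strips it before starting the program, whereas a path handed directly to posix_spawn() is used literally, backslashes included, so no such file exists. A minimal sketch of the difference follows; the helper name strip_space_escapes is hypothetical and only illustrates the transformation, it is not how Pharo-Chrome implements the fix.

```shell
#!/bin/sh
# Inside single quotes the backslashes stay literal, which is what a process
# launcher sees when the path is passed to it without going through a shell.
escaped='/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome'

# Hypothetical helper: drop the backslash in front of each space, giving the
# plain path that can be passed directly as the command.
strip_space_escapes() {
    printf '%s\n' "$1" | sed 's/\\ / /g'
}

plain=$(strip_space_escapes "$escaped")
printf '%s\n' "$plain"
```

The printed path is the same one used in the working OSSUnixSubprocess call above, with plain spaces and no backslashes.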