Hi,
This is fantastic news on several levels - the new JIT itself, but even more the new approach, process, infrastructure and tests. (Sorry, but this is a long/technical mail.)

A week ago I was one of the first people outside the development team to be able to test the new ARM64 JIT VM, on hardware they did not even test on themselves. In particular I used an Amazon AWS EC2 t4g.micro instance (1 GB) with Ubuntu Server 20.04.1 LTS. These machines use an ARM64 CPU (AWS Graviton2, Neoverse N1, Cortex-A76, ARM v8).

https://aws.amazon.com/ec2/graviton/
https://aws.amazon.com/ec2/instance-types/t4/
https://en.wikipedia.org/wiki/Annapurna_Labs#AWS_Graviton2

ubuntu@ip-172-30-0-23:~/test$ uname -a
Linux ip-172-30-0-23 5.4.0-1030-aws #31-Ubuntu SMP Fri Nov 13 11:42:04 UTC 2020 aarch64 aarch64 aarch64 GNU/Linux

ubuntu@ip-172-30-0-23:~/test$ lscpu
Architecture:                    aarch64
CPU op-mode(s):                  32-bit, 64-bit
Byte Order:                      Little Endian
CPU(s):                          2
On-line CPU(s) list:             0,1
Thread(s) per core:              1
Core(s) per socket:              2
Socket(s):                       1
NUMA node(s):                    1
Vendor ID:                       ARM
Model:                           1
Model name:                      Neoverse-N1
Stepping:                        r3p1
BogoMIPS:                        243.75
L1d cache:                       128 KiB
L1i cache:                       128 KiB
L2 cache:                        2 MiB
L3 cache:                        32 MiB
NUMA node0 CPU(s):               0,1
Vulnerability Itlb multihit:     Not affected
Vulnerability L1tf:              Not affected
Vulnerability Mds:               Not affected
Vulnerability Meltdown:          Not affected
Vulnerability Spec store bypass: Mitigation; Speculative Store Bypass disabled via prctl
Vulnerability Spectre v1:        Mitigation; __user pointer sanitization
Vulnerability Spectre v2:        Not affected
Vulnerability Srbds:             Not affected
Vulnerability Tsx async abort:   Not affected
Flags:                           fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm lrcpc dcpop asimddp ssbs

My reaction after one hour? Wow, wow, wow, this is incredible. It all just works and it seems pretty fast as well. I played with the vm/image for a couple of minutes and so far everything worked as expected and I had no crashes at all.
The order of magnitude of "1 tinyBenchmarks" is very similar to other (server) machines:

"'1894542090 bytecodes/sec; 146296146 sends/sec'"  "arm64"
"'2767567567 bytecodes/sec; 258718969 sends/sec'"  "macOS"
"'1227082085 bytecodes/sec; 109422120 sends/sec'"  "aws"
"'2101590559 bytecodes/sec; 166532391 sends/sec'"  "t3 lxd"

Here is a benchmark in the HTTP space: how fast can ZnServer respond to multiple concurrent requests over the local network?

$ ./pharo Pharo.image eval --no-quit 'ZnServer startDefaultOn: 1701' &
$ ab -k -n 1024 -c 8 http://localhost:1701/small
This is ApacheBench, Version 2.3 <$Revision: 1843412 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Server Software:        Zinc
Server Hostname:        localhost
Server Port:            1701

Document Path:          /small
Document Length:        124 bytes

Concurrency Level:      8
Time taken for tests:   0.268 seconds
Complete requests:      1024
Failed requests:        0
Keep-Alive requests:    1024
Total transferred:      317440 bytes
HTML transferred:       126976 bytes
Requests per second:    3814.45 [#/sec] (mean)
Time per request:       2.097 [ms] (mean)
Time per request:       0.262 [ms] (mean, across all concurrent requests)
Transfer rate:          1154.76 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:     0    2  19.0      0     267
Waiting:        0    2  19.0      0     267
Total:          0    2  19.0      0     267

Percentage of the requests served within a certain time (ms)
  50%      0
  66%      0
  75%      0
  80%      0
  90%      0
  95%      0
  98%      0
  99%     42
 100%    267 (longest request)

That is 3800 req/s with 8 concurrent threads, each response 124 bytes. And the output document is dynamically generated each time!
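As a sanity check, ab's headline numbers can be reproduced from the totals it reports. A quick back-of-the-envelope calculation (a Python sketch, using only figures from the /small run above; ab's own numbers differ slightly because it times with higher precision than the rounded 0.268 seconds it prints):

```python
# Figures reported by ab for the /small run above.
complete_requests = 1024
time_taken = 0.268            # seconds (rounded by ab)
total_transferred = 317440    # bytes

# Requests per second: total requests over total wall-clock time.
req_per_sec = complete_requests / time_taken
print(round(req_per_sec))     # ~3821, close to ab's 3814.45

# Transfer rate as ab reports it, in Kbytes/sec (1 Kbyte = 1024 bytes).
kb_per_sec = total_transferred / time_taken / 1024
print(round(kb_per_sec))      # ~1157, close to ab's 1154.76
```

The same arithmetic applies to the static-document runs below.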
Now a cached static binary document, first a small one (64 bytes):

$ ab -k -n 1024 -c 8 http://localhost:1701/bytes
This is ApacheBench, Version 2.3 <$Revision: 1843412 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Server Software:        Zinc
Server Hostname:        localhost
Server Port:            1701

Document Path:          /bytes
Document Length:        64 bytes

Concurrency Level:      8
Time taken for tests:   0.214 seconds
Complete requests:      1024
Failed requests:        0
Keep-Alive requests:    1024
Total transferred:      256000 bytes
HTML transferred:       65536 bytes
Requests per second:    4778.62 [#/sec] (mean)
Time per request:       1.674 [ms] (mean)
Time per request:       0.209 [ms] (mean, across all concurrent requests)
Transfer rate:          1166.65 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:     0    2  12.3      0     207
Waiting:        0    2  12.3      0     207
Total:          0    2  12.3      0     207

Percentage of the requests served within a certain time (ms)
  50%      0
  66%      0
  75%      0
  80%      0
  90%      0
  95%      0
  98%      5
  99%     64
 100%    207 (longest request)

That is 4700 req/s.

Now a larger one, 1024 bytes:

$ ab -k -n 1024 -c 8 http://localhost:1701/bytes/1024
This is ApacheBench, Version 2.3 <$Revision: 1843412 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Server Software:        Zinc
Server Hostname:        localhost
Server Port:            1701

Document Path:          /bytes/1024
Document Length:        1024 bytes

Concurrency Level:      8
Time taken for tests:   0.228 seconds
Complete requests:      1024
Failed requests:        0
Keep-Alive requests:    1024
Total transferred:      1241088 bytes
HTML transferred:       1048576 bytes
Requests per second:    4484.93 [#/sec] (mean)
Time per request:       1.784 [ms] (mean)
Time per request:       0.223 [ms] (mean, across all concurrent requests)
Transfer rate:          5308.34 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:     0    2  16.2      0     227
Waiting:        0    2  16.2      0     227
Total:          0    2  16.2      0     227

Percentage of the requests served within a certain time (ms)
  50%      0
  66%      0
  75%      0
  80%      0
  90%      0
  95%      0
  98%      0
  99%     41
 100%    227 (longest request)

Still 4400 req/s - 1024 requests finished in about 0.25 seconds, transferring 1 MB. These are really good numbers!

Of course, with an actual network between the client and the server, the numbers are much lower, but this localhost benchmark gives you an idea of the maximum speed that is theoretically possible. And under this load, the image+vm remained totally stable.

Since then I started a public Zn HTTP server, Pharo 9 live on an ARM64 JIT VM:

http://34.245.183.130:1701
http://34.245.183.130:1701/status
http://34.245.183.130:1701/help

I want to see how long it keeps running. You can test this for yourself. Idle load of the VM is also excellent: 0.3% CPU. Overall machine load is 0.00!

I fiddled around a bit more the next day. Everything kept going smoothly; nothing unexpected happened. I could use the command line tools to save a new image, load several of my packages with Metacello from GitHub using a baseline and Tonel, and I could run all test suites just fine. Everything was just *fast*. I installed NeoConsole so that I could use telnet to get a REPL into the running image. A week later, the server Pharo image was still running fine.

In summary: great, great work, and thanks again everyone for the effort. You (in particular Pablo, Guille, Esteban and the whole Pharo community) can be very proud of this achievement.

Sven

> On 12 Dec 2020, at 17:31, Stéphane Ducasse <[hidden email]> wrote:
> 
> Dear happy Pharoers and others,
> 
> Over the last months we have been working on the implementation of an MIT-licensed ARMv8 Just-In-Time compiler for the Pharo VM.
> We are very happy with the advances on this subject, as we have not only implemented a new backend but also added more than 2500
> tests covering the JIT, the primitives, code generation, plugins and the VM in general.
> In the process we fixed many problems.
> It shows that, following the roadmap decided during the General Assembly at PharoDays, we have been investing in the Pharo VM and that
> our efforts are greatly paying off.
> Also, we are producing a lot of documentation and improving the process to really democratize the development of the VM.
> So Pharo is in better shape than ever in this respect, and this opens a lot of possibilities for the future.
> 
> ## Call for Beta-Testers
> 
> We would like to announce that a first version of our JIT backend is available for beta-testing on ARM Linux machines running Ubuntu.
> We are now entering a beta testing stage of the VM and the image on Ubuntu ARM 64.
> We would like to invite you to our beta testing phase for the VM. If you're interested in participating,
> please contact [hidden email].
> 
> The following sections give more details on the current status and the next steps, including Apple Silicon, Windows ARM64 and Linux Open Build Service support.
> 
> ## Current Status
> 
> Our objective is to have a running JIT for the new aarch64 architecture (ARM 64 bits). This task includes not only a new backend for the JIT compiler but also adding support for external libraries, dependencies and the build process.
> This means having a working VM with features comparable to the existing Intel x64 one. We are targeting all the major operating systems running on this platform (Linux, OSX, Windows).
> Each of them presents different restrictions and conditions.
> 
> This is the current status:
> 
> - We implemented a full backend for the JIT compiler targeting aarch64.
> - All the image side was adapted to run on it, tested on Ubuntu ARM 64 bits.
> - We added support for: Iceberg (Libgit) / Athens (Cairo) / SDL / GTK
> - We implemented a LibFFI-based FFI backend as the default one for Pharo 9 on aarch64 (next to come on all platforms).
>   This opens the door to easily porting the features to other platforms and OSes.
> 
> ## Following Steps and Open Betas: Linux Open Build Service (OBS), Windows ARM64 and Apple Silicon
> 
> Linux systems: In the following days, we will also support Raspbian (Debian) and Manjaro on ARM64. For this, we are pushing the last details of a single Linux build system through OBS. So, if you want to start beta-testing these versions, please contact us. A public beta will open in around two weeks.
> 
> Windows systems: We have extended the build process to fully support the Microsoft Visual Studio compilers and to offer more flexibility in selecting targets, and we are also building it to run on Windows ARM. To correctly run the VM on Windows, all dependencies need to be built for aarch64. In the following weeks, we expect to have a working non-JIT version and a JIT version. The remaining points for a JIT version are related to the build process, as the API of the operating system has not changed from x64 to aarch64.
> 
> OSX systems: Our third target is a working version for the newest Apple Silicon. We are acquiring the corresponding hardware to test on and to address the differences in the API exposed to JIT applications. As is the case for the Windows VM, there is no need to change the machine code generation backend, but we do need to compile the external libraries and handle the particularities of the new OS version.
> 
> Thanks for your support, and again, if you would like to start beta testing the VM, please contact us. In the meantime, we will continue giving you news about the current state and where we are going.
> 
> The consortium would like to particularly thank Schmidt Buro and Lifeware for their contracts.
> 
> Regards,
> 
> Pablo, on behalf of the Pharo Consortium Engineers
> --------------------------------------------
> Stéphane Ducasse
> http://stephane.ducasse.free.fr / http://www.pharo.org
> 03 59 35 87 52
> Assistant: Aurore Dalle
> FAX 03 59 57 78 50
> TEL 03 59 35 86 16
> S.
> Ducasse - Inria
> 40, avenue Halley,
> Parc Scientifique de la Haute Borne, Bât. A, Park Plaza
> Villeneuve d'Ascq 59650
> France

-- 
Sven Van Caekenberghe
Proudly supporting Pharo
http://pharo.org
http://association.pharo.org
http://consortium.pharo.org