GPGPU

GPGPU

CdAB63
Hello,

I'm considering building a package to allow the development of math
using GPUs (NVIDIA). I would like to know about the general interest in
this topic and some guidelines for putting it in the Squeak trunk.

Best regards,

CdAB

Re: GPGPU

Andreas.Raab
Casimiro de Almeida Barreto wrote:
> I'm considering building a package to allow the development of math
> using GPUs (NVIDIA). I would like to know about the general interest in
> this topic and some guidelines for putting it in the Squeak trunk.

I'm only responding to the trunk question here since I can't speak for
others about their GPGPU interests (I have a floating interest in it but
I've never spent much time trying things out).

For the trunk inclusion, I think that at this point we're trying to get
the trunk smaller rather than larger. Even though we have been adding a
few packages, most of them were low-impact and very high-value for the
community at large (think syntax highlighting for example). Outside of
that we're trying to make things unloadable so that the trunk can
actually get smaller instead of larger. From this point of view the
inclusion of a GPGPU package seems unlikely.

However, there is an open question of what set of packages we will
redistribute with the next release and how to determine what should be
in that set of packages. We haven't really had this discussion - there
are some open questions as to which mechanism to use - so I would say
that if you'd like to have your package ready for distribution with the
next Squeak release, you:
a) Register the project on http://www.squeaksource.com/
b) Develop it and invite others to join you
c) Make sure that it works well and that people are aware of it

In particular, regarding the last point: when the time comes to discuss
this issue, you will have something to point to, something that
hopefully others use as well, and that will make it a compelling and
obvious choice for distribution.

Cheers,
   - Andreas

Re: Re: GPGPU

CdAB63
On 28-10-2009 02:14, Andreas Raab wrote:

> Casimiro de Almeida Barreto wrote:
>> I'm considering building a package to allow the development of math
>> using GPUs (NVIDIA). I would like to know about the general interest in
>> this topic and some guidelines for putting it in the Squeak trunk.
>
> I'm only responding to the trunk question here since I can't speak for
> others about their GPGPU interests (I have a floating interest in it
> but I've never spent much time trying things out).
>
> For the trunk inclusion, I think that at this point we're trying to
> get the trunk smaller rather than larger. Even though we have been
> adding a few packages, most of them were low-impact and very
> high-value for the community at large (think syntax highlighting for
> example). Outside of that we're trying to make things unloadable so
> that the trunk can actually get smaller instead of larger. From this
> point of view the inclusion of a GPGPU package seems unlikely.
The idea is not really to "include/put in trunk" (the purpose of the
trunk was well expressed in the paragraphs above and in previous
discussions on this list) but to avoid creating something that brings
conflicts (dependencies, use of deprecated stuff, etc.).

GPGPU stuff is not essential, and I don't think it will impact other
packages except for the need to use FFI or Alien. Since CUDA does not
have an MIT license, and since OpenCL is, for the moment, "Apple stuff"
(license-wise) and no more than a framework, one possible approach is
to use OpenGL (Mesa/GLEW/GLUT) to get things done. For now FFI is the
choice, since it works well with the trunk. I guess that as long as FFI
stays OK with the trunk I'll have no problems.

Well, it's 02:51 AM, so I guess I'll take a nap... :D

Cheers,

CdAB

Re: Re: GPGPU

Josh Gargus

On Oct 27, 2009, at 9:51 PM, Casimiro de Almeida Barreto wrote:

> The idea is not really to "include/put in trunk" (the purpose of the
> trunk was well expressed in the paragraphs above and in previous
> discussions on this list) but to avoid creating something that brings
> conflicts (dependencies, use of deprecated stuff, etc.).


As you say below, there should be minimal dependencies on the trunk.

>
> GPGPU stuff is not essential, and I don't think it will impact other
> packages except for the need to use FFI or Alien. Since CUDA does not
> have an MIT license, and since OpenCL is, for the moment, "Apple stuff"
> (license-wise)


It's not "Apple stuff". Nvidia has a conformant OpenCL 1.0
implementation (for Windows, and I believe Linux), and AMD has one in
beta.

There is no licensing issue, any more than there is for OpenGL. There's
no problem with writing an MIT-licensed program that dynamically links
(or uses FFI, same thing) against Apple's OpenGL implementation.
Croquet does this, as do many other open-source programs.

Of course, depending on the vendor, there may be restrictions on
redistributing their DLL. Apple probably wouldn't let you (but they
wouldn't need to, since OpenCL is always there on 10.6). Nvidia
probably would let you, just as they do with "cg.dll" (but you'd have
to actually look at their OpenCL license; I haven't).

There is a technical issue of how to do the "linking" on Windows. For
OpenGL, there is a file "opengl32.dll" that acts as a front-end for
drivers provided by Nvidia, AMD, etc. I don't believe that this exists
yet for OpenCL; for now you'd have to use "nvidia_opencl.dll" (or
whatever the name actually is) explicitly in the FFI spec. Croquet has
code in its FFI binding to OpenGL that can be adapted to deal with this.
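
For illustration, a minimal sketch of what such an explicit binding
might look like with Squeak's FFI. The module name "nvidia_opencl.dll"
is the same placeholder as above, and cdecl is an assumption; if the
DLL exports use stdcall (as OpenCL's CL_API_CALL does on Windows), an
<apicall: ...> pragma would be needed instead:

  clGetPlatformIDs: numEntries platforms: platforms num: numPlatforms
      "cl_int clGetPlatformIDs(cl_uint, cl_platform_id*, cl_uint*)"
      <cdecl: long 'clGetPlatformIDs' (ulong void* void*)
          module: 'nvidia_opencl.dll'>
      ^self externalCallFailed

Moving to a vendor-neutral front-end later would then only be a matter
of changing the module: name.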


> and no more than a framework, one possible approach is to use OpenGL
> (Mesa/GLEW/GLUT) to get things done.


Trust me, you don't want to do that. Especially if you don't know that
you don't want to do that ;-)

If you don't believe me, first go to gpgpu.org and look in the forums
for links to old framework code that abstracts away *some* of the pain
of GPGPU-on-OpenGL (I think that Mark Harris wrote one). Read that
code, and some programs that use it. Then, read the OpenCL 1.0 spec.
You will be convinced, I promise ;-)

Anyway, it sounds like a cool project. I hope that you keep us updated
with your progress!

Cheers,
Josh

Re: GPGPU

Stefan Marr
In reply to this post by CdAB63
Hi:

On 27 Oct 2009, at 20:46, Casimiro de Almeida Barreto wrote:
> I'm considering to build a package to allow the development of math
> using GPUs (NVIDIA).
Do you have any intention to do something other than math on the GPUs?
I guess you just want to provide a wrapper for the libraries, e.g.,
OpenCL, right?

Would it be feasible to run some Smalltalk code on it, maybe in a
restricted dialect, like Slang?

Best regards
Stefan


--
Stefan Marr
Software Languages Lab
Former Programming Technology Lab
Vrije Universiteit Brussel
Pleinlaan 2 / B-1050 Brussels / Belgium
http://prog.vub.ac.be/~smarr
Phone: +32 2 629 3956
Fax:   +32 2 629 3525


Re: GPGPU

CdAB63
On 28-10-2009 10:49, Stefan Marr wrote:
> Hi:
>
> On 27 Oct 2009, at 20:46, Casimiro de Almeida Barreto wrote:
>> I'm considering building a package to allow the development of math
>> using GPUs (NVIDIA).
> Do you have any intention to do something other than math on the GPUs?
> I guess you just want to provide a wrapper for the libraries, e.g.,
> OpenCL, right?
The current idea is to build a package to deal with floating-point
vector processing. So it's more than creating wrapper libraries. I've
been asked to do so by people intending to work on engineering
applications (I can't say more due to NDA issues), and I thought it
useful for my own purposes (such as helping to build an efficient
native ANN package).

Before considering this project I was dealing with the Cell processor
(IBM had an agreement with the lab I used to work for). But after
changes in IBM's policy for the Cell processor, and after Sony decided
to remove support for installing foreign OSes on PS3 hardware, it
became clear that working with this platform wouldn't be useful (really
a pity).

On the other hand, working with GPUs I have some alternatives. Hardware
is abundant and low-priced, and support is relatively easy to find.

About SDKs: NVIDIA has the CUDA environment (libraries + SDK). It is
not open software (NVIDIA EULA) and restricts things to NVIDIA.
Apple/Khronos have OpenCL, but AFAIK the tools are currently for Mac
OS X only. OpenCL is also not open software (Apple license) but has
wider support (NVIDIA/AMD).
>
> Would it be feasible to run some Smalltalk code on it, maybe in a
> restricted dialect, like Slang?
Currently I don't think so. Available GPU resources are very specific:
they're extremely efficient vector processors with several nice pipelines...
Best regards,

CdAB


Re: GPGPU

Josh Gargus
In reply to this post by Stefan Marr
I agree with Casimiro's response... GPUs aren't suitable for running
Smalltalk code. Larrabee might be interesting, since it will have 16
or more x86 processors, but it's difficult to see how to utilize the
powerful vector processor attached to each x86.

Your question was more specifically about running something like Slang
on it. It's important to remember that Slang isn't Smalltalk; it's C
with Smalltalk syntax (i.e., all Slang language constructs are
implemented by a simple 1-1 mapping onto the corresponding C language
feature). So yes, it would be possible to run something like Slang on
a GPU. Presumably, you would want to take the integration one step
farther than with Slang, and automatically compile the generated
OpenCL or CUDA code instead of dumping it to an external file.
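
For instance, a Slang-style method (a made-up example in the restricted
subset, not taken from VMMaker):

  sumOf: anArray length: n
      | s |
      s := 0.
      0 to: n - 1 do: [:i | s := s + (anArray at: i)].
      ^s

maps construct-for-construct onto a C function of the shape

  sqInt sumOflength(sqInt *anArray, sqInt n) {
      sqInt s = 0;
      sqInt i;
      for (i = 0; i <= n - 1; i += 1) {
          s = s + anArray[i];
      }
      return s;
  }

which is why retargeting the same idea at OpenCL C or CUDA is plausible.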

Instead of thinking of running Smalltalk on the GPU, I would think
about writing a DSL (domain-specific language) for a particular class
of problems that can be solved well on the GPU. Then I would think
about how to integrate this DSL nicely into Smalltalk.
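
To make that concrete, here is a minimal sketch of the lowest rung of
such a DSL: a method on a hypothetical KernelBuilder class (all names
illustrative, not an existing package) whose only job is to emit OpenCL
C source for a one-expression elementwise kernel:

  sourceFor: kernelName args: argNames expression: exprString
      "Answer OpenCL C source for an elementwise kernel over float buffers."
      ^String streamContents: [:s |
          s nextPutAll: '__kernel void '; nextPutAll: kernelName; nextPutAll: '('.
          argNames
              do: [:a | s nextPutAll: '__global float *'; nextPutAll: a]
              separatedBy: [s nextPutAll: ', '].
          s nextPutAll: ') {'; cr.
          s tab; nextPutAll: 'int i = get_global_id(0);'; cr.
          s tab; nextPutAll: exprString; cr.
          s nextPutAll: '}']

so that

  KernelBuilder new
      sourceFor: 'vadd'
      args: #('a' 'b' 'c')
      expression: 'c[i] = a[i] + b[i];'

answers a compilable __kernel string. A real DSL would of course build
the expression from Smalltalk message sends rather than take a C string.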

Sean McDermid has done something like this with C#, LINQ, HLSL, and
Direct3D (http://bling.codeplex.com/). He's not doing GPGPU per se,
but the point is how seamless his integration with C# is.

Cheers,
Josh

Re: GPGPU

CdAB63
On 28-10-2009 15:24, Josh Gargus wrote:
> I agree with Casimiro's response... GPUs aren't suitable for running
> Smalltalk code. Larrabee might be interesting, since it will have 16
> or more x86 processors, but it's difficult to see how to utilize the
> powerful vector processor attached to each x86.
Here I see two opportunities. The first would be to follow the advice
of Mr. Ingalls and start to develop a generic VM and related classes to
deal with parallel processing (something I think is long overdue, since
multicore processors have been around for such a long time); IMHO, not
dealing with SMP processing prevents dealing with NUMA processing,
where the advantages of Smalltalk should be astounding.

The second is to provide Squeak with solid intrinsic vector-processing
capabilities, which would reopen the field of high-performance
applications in science and engineering, and also more mundane
applications like the game industry.
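
To make "intrinsic vector processing" a bit more concrete, the surface
API might feel something like this (class and message names are purely
hypothetical, not an existing package):

  | a b c |
  a := GpuFloatVector fromArray: #(1.0 2.0 3.0 4.0).
  b := GpuFloatVector fromArray: #(5.0 6.0 7.0 8.0).
  c := a + b.    "dispatched to a GPU kernel, not run element by element in the image"
  c asArray      "=> #(6.0 8.0 10.0 12.0)"

The point being that the arithmetic runs on the device while the
vectors' storage lives outside the object memory.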

>
> Your question was more specifically about running something like Slang
> on it. It's important to remember that Slang isn't Smalltalk; it's C
> with Smalltalk syntax (i.e., all Slang language constructs are
> implemented by a simple 1-1 mapping onto the corresponding C language
> feature). So yes, it would be possible to run something like Slang on
> a GPU. Presumably, you would want to take the integration one step
> farther than with Slang, and automatically compile the generated
> OpenCL or CUDA code instead of dumping it to an external file.
>
> Instead of thinking of running Smalltalk on the GPU, I would think
> about writing a DSL (domain-specific language) for a particular class
> of problems that can be solved well on the GPU. Then I would think
> about how to integrate this DSL nicely into Smalltalk.
That's sort of my idea :)

I'm not considering CUDA at the moment because it would be more
specific to the NVIDIA architecture. Currently the GPU market is shared
mostly between NVIDIA and AMD/ATI, and AMD says they won't support CUDA
on their GPUs (see
http://www.amdzone.com/index.php/news/video-cards/11775-no-cuda-on-radeon
for an example). It's a pity, since last year it was reported that
Radeon compatibility in CUDA was almost complete. Besides, there are
licensing issues, and I just don't want to have "wrappers".

Of course I know many of the problems dealt with by CUDA and OpenCL:
the variable number and size of pipelines, problems with numeric
representation and FP precision, etc... etc... etc... And I know it
would be much easier just to write some wrappers or, easier yet, to
develop things in C/C++ and glue them in with FFI. But then, what would
be the gain for Squeak and the Smalltalk community?
Best regards,

CdAB

RE: GPGPU

Chris Hogan
Hmm, could you just plop the VM on top of Barrelfish and let it do all the fancy multiprocessor stuff for you?

http://www.linux-magazine.com/Online/News/Barrelfish-Multikernel-Operating-System-out-of-Zurich

http://www.barrelfish.org/



Chris Hogan

Re: GPGPU

CdAB63
On 29-10-2009 09:40, Christopher Hogan wrote:
> Hmm, could you just plop the VM on top of Barrelfish and let it do all
> the fancy multiprocessor stuff for you?
>
> http://www.linux-magazine.com/Online/News/Barrelfish-Multikernel-Operating-System-out-of-Zurich
>
> http://www.barrelfish.org/

Nope.

Some issues: currently, in Squeak, there's no way to have objects
running/living independently on different processors (meaning, among
other things, that the SqueakVM is a kind of "microkernel" with really
limited preemption & syncing mechanisms). If you think about separate
(disjoint) memory spaces, things get even worse. Even when you fork,
processes are not "run independently", and if you really want to run
things independently you have to use things from CommandShell and
OSProcess/OSProcessPlugin and OS pipelines.
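
A tiny illustration of the point about fork (plain Squeak, nothing
hypothetical): both blocks below become Squeak Processes, i.e. green
threads multiplexed within the single native VM thread, so they
interleave but never actually run at the same time on two cores:

  [1 to: 3 do: [:i | Transcript show: 'A', i printString; cr. Processor yield]] fork.
  [1 to: 3 do: [:i | Transcript show: 'B', i printString; cr. Processor yield]] fork.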

I think the VM should be re-engineered to allow instances running on
different processors/memory spaces and communicating via some protocol.
The challenges are big: syncing, security, performance optimization,
garbage collection, etc.

I thought of proposing something like this for a PhD (btw, I really
did), but the people I know at university are so fond of Java & Python
& other "small stuff" like "jackpot projects"... :( So, if I am to work
on this I have to find a way of funding myself through commercial
projects... perhaps leaving this entrepreneurial desert called BR.

Cheers,

CdAB

RE: GPGPU

Chris Hogan
Interesting.

Does the HydraVM address some of those issues?

http://squeakvm.org/~sig/hydravm/devnotes.html

Chris Hogan


Re: GPGPU

CdAB63
On 29-10-2009 14:04, Christopher Hogan wrote:
> Interesting.
>
> Does the HydraVM address some of those issues?
>
> http://squeakvm.org/~sig/hydravm/devnotes.html
HydraVM implements preemptive threads, so forks are dealt with more
properly on multicore processors. Note that under the SqueakVM, forks
don't create threads. Same thing with CobaltVM. Hydra works only on
Win32 & Cobalt only on Mac OS X (AFAIK).

But my considerations go further.

CdAB

Re: GPGPU

Andreas.Raab
Casimiro de Almeida Barreto wrote:
> HydraVM implements preemptive threads, so forks are dealt with more
> properly on multicore processors. Note that under the SqueakVM, forks
> don't create threads. Same thing with CobaltVM. Hydra works only on
> Win32 & Cobalt only on Mac OS X (AFAIK).

All of this is pretty much entirely wrong ;-) HydraVM allows the
creation of multiple interacting object memories where each object
memory is served by a native thread and can contain many Squeak threads.
Thus, a fork in HydraVM does not create a native thread either - the
creation of a native thread is bound to the instantiation of another
object memory. Also, HydraVM works on all the major platforms (Win, Mac,
Unix).

Cheers,
   - Andreas

Re: Re: GPGPU

CdAB63
On 29-10-2009 15:17, Andreas Raab wrote:

> All of this is pretty much entirely wrong ;-) HydraVM allows the
> creation of multiple interacting object memories where each object
> memory is served by a native thread and can contain many Squeak
> threads. Thus, a fork in HydraVM does not create a native thread
> either - the creation of a native thread is bound to the instantiation
> of another object memory. Also, HydraVM works on all the major
> platforms (Win, Mac, Unix).
Sometimes it's a relief to be completely wrong!!! But I was just quoting this:

"Availability

Currently, HydraVM runs only on Win32 platform. For making it run on different platforms you need to change platform-specific code to make it compatible with new interpreter.

There is no binaries for download publicly available yet, so you would need to download and build VM yourself."

From: http://squeakvm.org/~sig/hydravm/devnotes.html

So, I guess that the page is outdated.

I will look at it ASAP, since it interests me a lot.


Thanks, Andreas

Re: Re: GPGPU

Josh Gargus
I thought Igor would respond, but since he hasn't...

IIRC, Hydra is currently Windows-only, but is designed such that the platform-specific parts are small.  The bulk of the changes (such as changing the primitive interface to support multiple threads, etc.) are cross-platform.

Cheers,
Josh


Re: GPGPU

Josh Gargus
In reply to this post by CdAB63

On Oct 28, 2009, at 1:23 PM, Casimiro de Almeida Barreto wrote:

> (...)
>
> I'm not considering CUDA at the moment because it would be more
> specific to the NVIDIA architecture. Currently the GPU market is
> shared mostly between NVIDIA and AMD/ATI, and AMD says they won't
> support CUDA on their GPUs (see
> http://www.amdzone.com/index.php/news/video-cards/11775-no-cuda-on-radeon
> for an example). It's a pity, since last year it was reported that
> Radeon compatibility in CUDA was almost complete. Besides, there are
> licensing issues

Again, I'm not sure what issues you are referring to. Are you talking
about practical issues that would prevent people from deploying Squeak
GPGPU code? If so, I don't think that there are any issues. Unlike,
say, Microsoft, the GPU vendors have much less incentive to lock you
into their platforms; they just want to sell more GPUs, and they won't
do that by introducing gratuitous licensing roadblocks.

Or perhaps you're more motivated by FSF-esque notions of software
freedom? It's true that there is a free OpenGL implementation and no
free OpenCL implementation, yet. However, the specification is open,
and it's a matter of time before a free implementation is available.
For example, my understanding is that Tungsten Graphics' Gallium3D
framework is designed to support OpenCL as well as OpenGL.


> and I just don't want to have "wrappers".

Could you elaborate a bit about the "solid intrinsic vector processing
capabilities" that you are thinking of, and in particular how they go
beyond being mere wrappers?

It seems like a layered approach is the way to go. Assuming that
OpenCL is the target, the lowest layer (and the first useful artifact)
would be a wrapper for the OpenCL function calls and a model for
managing memory outside of Squeak's object-memory (perhaps using
Aliens).
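
A sketch of what that lowest layer might look like (hypothetical class
and selector names, using FFI rather than Aliens, and assuming Squeak's
usual WeakRegistry finalization so device memory is released once the
image-side handle is garbage-collected):

  Object subclass: #CLBuffer
      instanceVariableNames: 'handle'
      classVariableNames: ''
      poolDictionaries: ''
      category: 'OpenCL-Sketch'

  handle: anExternalAddress
      "anExternalAddress is the cl_mem answered by a bound clCreateBuffer."
      handle := anExternalAddress.
      WeakRegistry default add: self

  finalize
      "Sent after I am collected; release the device-side buffer."
      handle isNil ifFalse:
          [OpenCL clReleaseMemObject: handle.
           handle := nil]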

The next layer would be a more natural integration of OpenCL's
sequencing/synchronization primitives (such as "events") into Squeak.

After that, the sky's the limit... here's where Squeak could really
shine.

Do you agree with this characterization?


> Of course I know many of the problems dealt with by CUDA and OpenCL:
> the variable number and size of pipelines, problems with numeric
> representation and FP precision, etc... etc... etc...


Yeah, sorry... I feel like I made bad assumptions in my initial
response.


> And I know it would be much easier just to write some wrappers or,
> easier yet, to develop things in C/C++ and glue them in with FFI. But
> then, what would be the gain for Squeak and the Smalltalk community?


That's the spirit!  :-D

Cheers,
Josh


Re: GPGPU

CdAB63
On 30-10-2009 04:15, Josh Gargus wrote:

> On Oct 28, 2009, at 1:23 PM, Casimiro de Almeida Barreto wrote:
>> (...) Besides, there are licensing issues
>
> Again, I'm not sure what issues you are referring to. Are you talking
> about practical issues that would prevent people from deploying Squeak
> GPGPU code? If so, I don't think that there are any issues. Unlike,
> say, Microsoft, the GPU vendors have much less incentive to lock you
> into their platforms; they just want to sell more GPUs, and they won't
> do that by introducing gratuitous licensing roadblocks.
>
> Or perhaps you're more motivated by FSF-esque notions of software
> freedom? It's true that there is a free OpenGL implementation and no
> free OpenCL implementation, yet. However, the specification is open,
> and it's a matter of time before a free implementation is available.
> For example, my understanding is that Tungsten Graphics' Gallium3D
> framework is designed to support OpenCL as well as OpenGL.

There are two issues:

  • One is related to the way things are licensed and, in the long run,
it is important. So important that Squeak migrated from the Apple
license to the MIT license (which would otherwise have been
unnecessary). One idea behind free software is that it is developed by
a neutral party, so issues like the one AMD raised against CUDA
(meaning: hey, NVIDIA will make CUDA better for their own GPUs)
disappear.

  • The other is related to the stability & portability of current
GPGPU software. While CUDA is developed & ported to the three major
platforms (meaning Microsoft, Mac OS X & Linux), OpenCL is still under
development. If you go to the Khronos site you'll be able to get the
specs (http://www.khronos.org/registry/cl/specs/opencl-1.0.48.pdf), but
there are only experimental bindings. The situation of OpenCL on Linux
is unclear at the moment: you go to the forums and see only
speculation; you go to the manufacturers' sites and some have bindings
to OpenCL (NVIDIA), but I really have to study them to make sure they
aren't "flavored" and to see where they link to. Where's the real work
(see the paragraph below)? Besides, I work with Linux, and I'm aware of
the position Apple keeps against this platform.


>> and I just don't want to have "wrappers".
>
> Could you elaborate a bit about the "solid intrinsic vector processing
> capabilities" that you are thinking of, and in particular how they go
> beyond being mere wrappers?
>
> It seems like a layered approach is the way to go. Assuming that
> OpenCL is the target, the lowest layer (and the first useful artifact)
> would be a wrapper for the OpenCL function calls and a model for
> managing memory outside of Squeak's object-memory (perhaps using
> Aliens).
>
> The next layer would be a more natural integration of OpenCL's
> sequencing/synchronization primitives (such as "events") into Squeak.
>
> After that, the sky's the limit... here's where Squeak could really
> shine.
>
> Do you agree with this characterization?
Yes, the characterization is correct. It's the same for the three
alternatives: OpenGL, CUDA, OpenCL.

It seems that you have worked with OpenCL. If you know about Linux
implementations, I'd like to hear about them. Currently I know about
the beta OpenCL SDK released by AMD just three months ago
(http://developer.amd.com/GPU/ATISTREAMSDKBETAPROGRAM/Pages/default.aspx#four);
it is ported to the openSUSE/Ubuntu Linux distros and limited to ATI
GPUs (well, I may be wrong about this).

Best regards,

CdAB

Re: Re: GPGPU

Igor Stasenko
In reply to this post by Josh Gargus
2009/10/30 Josh Gargus <[hidden email]>:
> I thought Igor would respond, but since he hasn't...
> IIRC, Hydra is currently Windows-only, but is designed such that the
> platform-specific parts are small.  The bulk of the changes (such as
> changing the primitive interface to support multiple threads, etc.) are
> cross-platform.

Sorry, I missed this topic.
By availability, I meant, of course, the VM binaries, _ready to use_.
The core part of the VM, the interpreter, is a cross-platform thing,
but in order to get it working you have to provide platform-specific
implementations of various functions which the VM expects to have. The
same applies to plugins, which require platform-specific functions.


--
Best regards,
Igor Stasenko AKA sig.