Hi Eliot,

I got this output from configure. There is no line "******** disabling vm-sound-ALSA".

Rob

checking for custom display support... no
******** disabling vm-display-custom
checking linux/fb.h usability... yes
checking linux/fb.h presence... yes
checking for linux/fb.h... yes
******** disabling vm-display-Quartz
checking for X... libraries , headers
checking for gethostbyname... yes
checking for connect... yes
checking for remove... yes
checking for shmat... yes
checking for IceConnectionNumber in -lICE... no
checking for XOpenDisplay in -lX11... yes
checking for XShmAttach in -lXext... yes
checking GL/gl.h usability... yes
checking GL/gl.h presence... yes
checking for GL/gl.h... yes
checking for glIsEnabled in -lGL... no
checking for Advanced Linux Sound Architecture... yes
checking for custom sound support... no
******** disabling vm-sound-custom
checking for Mac OS X CoreAudio... no
******** disabling vm-sound-MacOSX
checking for Network Audio System... no
******** disabling vm-sound-NAS
checking for Open Sound System... yes
checking for SunOS/Solaris audio... no
******** disabling vm-sound-Sun
checking for MIDI support via ALSA... yes
checking util.h usability... no
checking util.h presence... no
checking for util.h... no
checking libutil.h usability... no
checking libutil.h presence... no
checking for libutil.h... no
checking pty.h usability... yes
checking pty.h presence... yes
checking for pty.h... yes
checking stropts.h usability... yes
checking stropts.h presence... yes
checking for stropts.h... yes
checking for library containing openpty... -lutil
******** disabling PyBridge
checking for FFI support... requires libffi
checking ffi.h usability... no
checking ffi.h presence... no
checking for ffi.h... no
******** disabling SqueakFFIPrims
checking for unsetenv... yes
checking for UUID support... yes
checking for uuid_generate in -luuid... yes
checking for XOpenDisplay in -lX11...
(cached) yes ******** disabling vm-sound-OSS ******** disabling vm-display-fbdev -------------------------------------------------- From: "Rob Withers" <[hidden email]> Sent: Saturday, July 17, 2010 2:18 PM To: "Squeak Virtual Machine Development Discussion" <[hidden email]> Subject: Re: [Vm-dev] Cog on linux > > Eliot, > > It is still giving me an ALSA error. Here is my call to configure and > make: > > ../../platforms/unix/config/configure --without-vm-sound-ALSA --without-vm-sound-OSS > --without-vm-display-fbdev --without-npsqueak CC="gcc -m32" > CFLAGS="-g -O2 -msse2 -D_GNU_SOURCE -DNDEBUG -DITIMER_HEARTBEAT=1 -DNO_VM_PROFILE=1 > -DCOGMTVM=0" LIBS=-lpthread > > make install prefix=/home1/vawhigso/public_html/squeakelib/Cog/Cog > > I find it best to delete the entire unixbuild directory and untar it fresh > from the tarball. > > Thanks, > Rob > > Error: > > /bin/sh > /home1/vawhigso/public_html/squeakelib/Cog/unixbuild/bld/libtool --mode=compile > gcc -m32 -g -O2 -msse2 -D_GNU_SOURCE -DNDEBUG -DITIMER_HEARTBEAT=1 -DNO_VM_PROFILE=1 > -DCOGMTVM=0 -msse -DLSB_FIRST=1 -DHAVE_CONFIG_H -I/home1/vawhigso/public_html/squeakelib/Cog/unixbuild/bld > -I/home1/vawhigso/public_html/squeakelib/Cog/unixbuild/bld -I/home1/vawhigso/public_html/squeakelib/Cog/platforms/unix/vm > -I/home1/vawhigso/public_html/squeakelib/Cog/platforms/Cross/vm -I/home1/vawhigso/public_html/squeakelib/Cog/src/vm > -c -o sqUnixSoundALSA.lo > /home1/vawhigso/public_html/squeakelib/Cog/platforms/unix/vm-sound-ALSA/sqUnixSoundALSA.c > gcc -m32 -g -O2 -msse2 -D_GNU_SOURCE -DNDEBUG -DITIMER_HEARTBEAT=1 -DNO_VM_PROFILE=1 > -DCOGMTVM=0 -msse -DLSB_FIRST=1 -DHAVE_CONFIG_H -I/home1/vawhigso/public_html/squeakelib/Cog/unixbuild/bld > -I/home1/vawhigso/public_html/squeakelib/Cog/unixbuild/bld -I/home1/vawhigso/public_html/squeakelib/Cog/platforms/unix/vm > -I/home1/vawhigso/public_html/squeakelib/Cog/platforms/Cross/vm -I/home1/vawhigso/public_html/squeakelib/Cog/src/vm > -c > 
/home1/vawhigso/public_html/squeakelib/Cog/platforms/unix/vm-sound-ALSA/sqUnixSoundALSA.c > -fPIC -DPIC -DPIC -o sqUnixSoundALSA.o > mv -f sqUnixSoundALSA.o sqUnixSoundALSA.lo > /bin/sh > /home1/vawhigso/public_html/squeakelib/Cog/unixbuild/bld/libtool --mode=link > gcc -m32 -g -O2 -msse2 -D_GNU_SOURCE -DNDEBUG -DITIMER_HEARTBEAT=1 -DNO_VM_PROFILE=1 > -DCOGMTVM=0 -msse -DLSB_FIRST=1 -avoid-version -module -rpath > /home1/vawhigso/public_html/squeakelib/Cog/Cog/lib/squeak/3.9-7 -o > vm-sound-ALSA.la sqUnixSoundALSA.lo -lasound > mkdir .libs > rm -fr .libs/vm-sound-ALSA.la .libs/vm-sound-ALSA.* .libs/vm-sound-ALSA.* > (cd . && ln -s sqUnixSoundALSA.lo sqUnixSoundALSA.o) > gcc -m32 -shared > sqUnixSoundALSA.lo -lasound -Wl,-soname -Wl,vm-sound-ALSA -o > .libs/vm-sound-ALSA > /usr/bin/ld: skipping incompatible /usr/lib64/libasound.so when searching > for -lasound > /usr/bin/ld: skipping incompatible /usr/lib64/libasound.a when searching > for -lasound > /usr/bin/ld: cannot find -lasound > collect2: ld returned 1 exit status > make[1]: *** [vm-sound-ALSA.la] Error 1 > make: *** [vm-sound-ALSA.la] Error 2 > > > > From: Eliot Miranda > Sent: Saturday, July 17, 2010 2:00 PM > To: Squeak Virtual Machine Development Discussion > Subject: Re: [Vm-dev] Cog on linux > > > > > > > > > On Fri, Jul 16, 2010 at 4:37 PM, Rob Withers <[hidden email]> wrote: > > > > > -------------------------------------------------- > From: "Rob Withers" <[hidden email]> > Sent: Friday, July 16, 2010 7:08 PM > > To: "Squeak Virtual Machine Development Discussion" > <[hidden email]> > Subject: Re: [Vm-dev] Cog on linux > > > > > > > > > > > > -------------------------------------------------- > From: "Levente Uzonyi" <[hidden email]> > > Sent: Friday, July 16, 2010 6:44 PM > > To: "Squeak Virtual Machine Development Discussion" > <[hidden email]> > Subject: Re: [Vm-dev] Cog on linux > > > > On Fri, 16 Jul 2010, Rob Withers wrote: > > > > > > -------------------------------------------------- > From: 
"Casey Ransberger" <[hidden email]> > Sent: Friday, July 16, 2010 5:10 PM > To: "Squeak Virtual Machine Development Discussion" > <[hidden email]> > Subject: Re: [Vm-dev] Cog on linux > > > > I think you just need libasound > > > > Ok guys, I am working on it with my webhost. I am not sure how I can > install an RPM on their box, so I submitted a ticket. > > > Actually I don't think that any sound plugin would work, because the > 32-bit binaries will not be available. But you don't need sound at all on > a server, do you? > > > > > You lost me on 32-bit binaries not being available. I found > alsa-lib-1.0.13.tar.bz2 and I am preparing to build it. > > > My error said: > /usr/bin/ld: skipping incompatible /usr/lib64/libasound.so when searching > for -lasound > /usr/bin/ld: skipping incompatible /usr/lib64/libasound.a when searching > for -lasound > > > So it looks like I need a .so and a .a file. As rusty as I am on building > for unix, are these static libs or shared libs? I do need to figure out > how > to cross compile to a 32bit system. I am researching this for the right > target. I need i386-linux or something. I am attaching the config.guess > file they use for this. Can anyone give me a helpful hand what I should > specify doing ... > > './configure --enable-shared=no --enable-static=yes --target=i386-linux' > > > > Damn! I can't figure out how to change the install path to other than > /usr/include, etc. I can't write to those directories. I am stuck. > > > Regarding your observation that I don't need sound on a server, that is > spot > on. It may be too entangled for the time being. > > > Is it possible to unentangle sound? > > > > Try configuring Cog without libalsa. e.g. 
> > > ../../platforms/unix/config/configure --without-vm-sound-ALSA --without-vm-display-fbdev > --without-npsqueak -prefix=/home/qwaq/qwaqvm/ > CFLAGS="-g -O1 -msse2 -D_GNU_SOURCE -DITIMER_HEARTBEAT=1 -DNO_VM_PROFILE=1 > -DCOGMTVM=0" LIBS=-lpthread > > and if you have problems with OSS sound you can disable that too > > > ../../platforms/unix/config/configure --without-vm-sound-ALSA --without-vm-sound-OSS > --without-vm-display-fbdev --without-npsqueak -prefix=/home/qwaq/qwaqvm/ > CFLAGS="-g -O1 -msse2 -D_GNU_SOURCE -DITIMER_HEARTBEAT=1 -DNO_VM_PROFILE=1 > -DCOGMTVM=0" LIBS=-lpthread > > > that will leave you with vm-sound-null which is fine for a server. > > Thanks, > Rob > > > > Cheers, > Rob > > > > Levente > > > > Thanks, > Rob |
In reply to this post by Rob Withers
Rob, my understanding is that if you configure without ALSA it should not attempt to compile sqUnixSoundALSA.c. So something must be screwed up. Try make reallyclean, then repeat the configure carefully. The configure should not produce a vm-sound-ALSA directory, and there should be no mention of vm-sound-ALSA in the generated Makefile. If it does you're either going to have to debug configure or manually edit the generated Makefile to remove mention of vm-sound-ALSA.
cheers Eliot
On Sat, Jul 17, 2010 at 11:18 AM, Rob Withers <[hidden email]> wrote:
|
SUCCESS!!
configure did generate vm-sound-ALSA target and
mkdir. I deleted the directory and the two entries for it in the root
Makefile. Ran make install. It compiled like a champ and
is now running as a daemon.
Thanks for all the help
Eliot/everyone!
Rob
|
In reply to this post by Eliot Miranda-2
On Thu, 15 Jul 2010, Eliot Miranda wrote: (Pine can't quote your mail, because the first part is empty.) " can you edit unixbuild/HowToBuild with this info? Err on the expansive side and feel free to include a good URL if you know of one." This is what I came up with: http://leves.web.elte.hu/squeak/HowToBuild Levente |
Thanks, Levente. It's been integrated. I added CXX="g++ -m32" to the instructions too as there's potentially the Bochs plugin which is C++. 2010/7/21 Levente Uzonyi <[hidden email]>
|
In reply to this post by Rob Withers
Hi Rob,

In case you never got a CogVM compiled: I was able to compile a 32-bit VM on 64-bit Ubuntu 10.04 desktop after installing these packages:

lib32asound2-dev libgl1-mesa-dev libglu1-mesa-dev build-essential ia32-libs gcc-multilib g++-multilib

Then follow the instructions here:

http://www.squeakvm.org/svn/squeak/branches/Cog/unixbuild/HowToBuild

This was the configure incantation:

../../platforms/unix/config/configure CC="gcc -m32" CXX="g++ -m32" CFLAGS="-g -O2 -msse2 -D_GNU_SOURCE -DNDEBUG -DITIMER_HEARTBEAT=1 -DNO_VM_PROFILE=1 -DCOGMTVM=0" LIBS=-lpthread

Then do this:

http://forum.world.st/CogVM-td2262856.html#a2262856

But: the VM will open your image, then seg fault a short, random amount of time after that, even if you make the change Eliot recommends here:

http://forum.world.st/Cog-segfaults-on-linux-td2289607.html#a2289607

Good luck.

Paul
|
On Thu, Jul 22, 2010 at 1:10 PM, Paul DeBruicker <[hidden email]> wrote:
Thanks, Paul. I've added this to the HowToBuild.

> Then follow the instructions here:

Have you updated to r2244 or better? As I hoped in the log:

svn log platforms/unix/vm/sqUnixHeartbeat.c
r2244 | eliot | 2010-07-20 11:20:26 -0700 (Tue, 20 Jul 2010) | 3 lines

Fix heartbeat clock log (-ve % +ve => -ve bounds violation).
This will hopefully fix crashes in the heartbeat under linux.

If you're already at 2244 then what's the backtrace in gdb, registers etc?

> Good luck.
|
In reply to this post by Rob Withers
> Have you updated to r2244 or better? As I hoped in the log:
>
> svn log platforms/unix/vm/sqUnixHeartbeat.c
> r2244 | eliot | 2010-07-20 11:20:26 -0700 (Tue, 20 Jul 2010) | 3 lines
>
> Fix heartbeat clock log (-ve % +ve => -ve bounds violation).
> This will hopefully fix crashes in the heartbeat under linux.
>
> If you're already at 2244 then what's the backtrace in gdb, registers etc?
>

src, and unixbuild. You'll have to be explicit about what you want from gdb as I'm a complete novice. Here's the gdb session:

paul@paul-laptop:~/src/squeakvm/unixbuild/bld$ gdb ./squeak
GNU gdb (GDB) 7.1-ubuntu
Copyright (C) 2010 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law. Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>...
Reading symbols from /home/paul/src/squeakvm/unixbuild/bld/squeak...done.
(gdb) run Pharo-1.1-11409-rc4dev10.07.2.image
Starting program: /home/paul/src/squeakvm/unixbuild/bld/squeak Pharo-1.1-11409-rc4dev10.07.2.image
warning: the debug information found in "/lib/ld-2.11.1.so" does not match "/lib/ld-linux.so.2" (CRC mismatch).
[Thread debugging using libthread_db enabled]
[New Thread 0xb7adbb70 (LWP 27073)]

Program received signal SIGUSR2, User defined signal 2.
[Switching to Thread 0xb7adbb70 (LWP 27073)]
heartbeat_handler (sig=14, sig_info=0x63, context=0x0) at /home/paul/src/squeakvm/platforms/unix/vm/sqUnixHeartbeat.c:461
461 {
(gdb) bt
#0 heartbeat_handler (sig=14, sig_info=0x63, context=0x0) at /home/paul/src/squeakvm/platforms/unix/vm/sqUnixHeartbeat.c:461
#1 <signal handler called>
#2 0xf7fdf430 in __kernel_vsyscall ()
#3 0xf7fabb16 in nanosleep () from /lib32/libpthread.so.0
#4 0x0805fa38 in tickerSleepCycle (ignored=0x0) at /home/paul/src/squeakvm/platforms/unix/vm/sqUnixHeartbeat.c:375
#5 0xf7fa396e in start_thread () from /lib32/libpthread.so.0
#6 0xf7ed6b5e in clone () from /lib32/libc.so.6
(gdb) info registers
eax 0xe 14
ecx 0x0 0
edx 0x0 0
ebx 0xb7adb388 -1213353080
esp 0xb7adadfc 0xb7adadfc
ebp 0xb7adb398 0xb7adb398
esi 0xb7adbb70 -1213351056
edi 0x3d0f00 4001536
eip 0x805f6d0 0x805f6d0 <heartbeat_handler>
eflags 0x296 [ PF AF SF IF ]
cs 0x23 35
ss 0x2b 43
ds 0x2b 43
es 0x2b 43
fs 0x0 0
gs 0x63 99
(gdb)
|
Hi Paul, On Thu, Jul 22, 2010 at 1:39 PM, Paul DeBruicker <[hidden email]> wrote:
OK, first create .gdbinit in your home directory with the following contents

---------8<------- ~/.gdbinit ---------8<-------
set history save on
handle SIGUSR1 nostop noprint noignore
handle SIGUSR2 nostop noprint noignore
handle SIGALRM nostop noprint noignore
handle SIGPOLL nostop noprint noignore
handle SIGPIPE nostop noprint noignore
---------------------------8<---------------------

then run as usual and when the segfault happens type

(gdb) where
(gdb) info registers
(gdb) x/5i $eip
(gdb) info threads

and post the results. where gives a stack backtrace. info registers prints the registers. x/5i $eip prints the faulting instruction and the 4 following it. If it's very close to the start of the function containing the segfault you can also say disass func and get the code surrounding the fault. info threads prints how many threads there are. If you want you can say thread N and then where to get the stack backtrace for each thread.
cheers Eliot
|
In reply to this post by Rob Withers
Hi Eliot,

On 07/22/2010 05:02 PM, [hidden email] wrote:
> handle SIGUSR2 nostop noprint noignore
>

If I include the above line in my .gdbinit then gdb complains:

Cannot find user-level thread for LWP XXXXX

where XXXXX is the process number for the VM. Sometimes the VM window stays open and freezes at that point and sometimes it closes. Gdb then states that the "Target is running" when I type in the commands you listed. If I comment the "handle SIGUSR2 ..." line out then I get this from those commands:

(gdb) where
#0 0xf7fdf430 in __kernel_vsyscall ()
#1 0xf7fabb16 in nanosleep () from /lib32/libpthread.so.0
#2 0x0805fa38 in tickerSleepCycle (ignored=0x0) at /home/paul/src/squeakvm/platforms/unix/vm/sqUnixHeartbeat.c:375
#3 0xf7fa396e in start_thread () from /lib32/libpthread.so.0
#4 0xf7ed6b5e in clone () from /lib32/libc.so.6
(gdb) info registers
eax 0xfffffdfc -516
ecx 0x0 0
edx 0xb7adb388 -1213353080
ebx 0xb7adb388 -1213353080
esp 0xb7adb358 0xb7adb358
ebp 0xb7adb398 0xb7adb398
esi 0xb7adbb70 -1213351056
edi 0x3d0f00 4001536
eip 0xf7fdf430 0xf7fdf430 <__kernel_vsyscall+16>
eflags 0x296 [ PF AF SF IF ]
cs 0x23 35
ss 0x2b 43
ds 0x2b 43
es 0x2b 43
fs 0x0 0
gs 0x63 99
(gdb) x/5i $eip
=> 0xf7fdf430 <__kernel_vsyscall+16>: pop %ebp
0xf7fdf431 <__kernel_vsyscall+17>: pop %edx
0xf7fdf432 <__kernel_vsyscall+18>: pop %ecx
0xf7fdf433 <__kernel_vsyscall+19>: ret
0xf7fdf434: add %ch,(%esi)
(gdb) info threads
* 2 Thread 0xb7adbb70 (LWP 27239) 0xf7fdf430 in __kernel_vsyscall ()
1 Thread 0xf7e056c0 (LWP 27236) heartbeat_handler (sig=14, sig_info=0x63, context=0x0) at /home/paul/src/squeakvm/platforms/unix/vm/sqUnixHeartbeat.c:461
(gdb) thread 1
[Switching to thread 1 (Thread 0xf7e056c0 (LWP 27236))]#0 heartbeat_handler (sig=14, sig_info=0x63, context=0x0) at /home/paul/src/squeakvm/platforms/unix/vm/sqUnixHeartbeat.c:461
461 {
(gdb) bt
#0 heartbeat_handler (sig=14, sig_info=0x63, context=0x0) at /home/paul/src/squeakvm/platforms/unix/vm/sqUnixHeartbeat.c:461
#1 <signal handler called>
#2 0xf7feefe0 in _dl_debug_state () from /lib/ld-linux.so.2
#3 0xf7ff272c in ?? () from /lib/ld-linux.so.2
#4 0xf7fee2f6 in ?? () from /lib/ld-linux.so.2
#5 0xf7ff2106 in ?? () from /lib/ld-linux.so.2
#6 0xf7fb7c0b in ?? () from /lib32/libdl.so.2
#7 0xf7fee2f6 in ?? () from /lib/ld-linux.so.2
#8 0xf7fb809c in ?? () from /lib32/libdl.so.2
#9 0xf7fb7b41 in dlopen () from /lib32/libdl.so.2
#10 0xf7c01b27 in ?? () from /usr/lib32/libX11.so.6
#11 0xf7c01fe7 in _XNoticeCreateBitmap () from /usr/lib32/libX11.so.6
#12 0xf7c0220d in XCreatePixmap () from /usr/lib32/libX11.so.6
#13 0xf7c010e2 in XCreateBitmapFromData () from /usr/lib32/libX11.so.6
#14 0xf7db70bb in display_ioSetCursorWithMask (cursorBitsIndex=-1210764836, cursorMaskIndex=<value optimized out>, offsetX=-1, offsetY=-1) at /home/paul/src/squeakvm/platforms/unix/vm-display-X11/sqUnixX11.c:3855
#15 0x08071422 in primitiveBeCursor () at /home/paul/src/squeakvm/src/vm/gcc3x-cointerp.c:23540
#16 0x0807f443 in interpret () at /home/paul/src/squeakvm/src/vm/gcc3x-cointerp.c:4872
#17 0x0807eeec in enterSmalltalkExecutiveImplementation () at /home/paul/src/squeakvm/src/vm/gcc3x-cointerp.c:14771
#18 0x0807f118 in initStackPagesAndInterpret () at /home/paul/src/squeakvm/src/vm/gcc3x-cointerp.c:18367
#19 0x0805eed3 in main (argc=2, argv=0xffffcda4, envp=0xffffcdb0) at /home/paul/src/squeakvm/platforms/unix/vm/sqUnixMain.c:1627
(gdb) thread 2
[Switching to thread 2 (Thread 0xb7adbb70 (LWP 27239))]#0 0xf7fdf430 in __kernel_vsyscall ()
(gdb) bt
#0 0xf7fdf430 in __kernel_vsyscall ()
#1 0xf7fabb16 in nanosleep () from /lib32/libpthread.so.0
#2 0x0805fa38 in tickerSleepCycle (ignored=0x0) at /home/paul/src/squeakvm/platforms/unix/vm/sqUnixHeartbeat.c:375
#3 0xf7fa396e in start_thread () from /lib32/libpthread.so.0
#4 0xf7ed6b5e in clone () from /lib32/libc.so.6
|
Hi Paul, On Thu, Jul 22, 2010 at 2:14 PM, Paul DeBruicker <[hidden email]> wrote:
just looks like the OS/run-time is not letting the program set a handler for SIGUSR2 and/or not allowing it to be caught. This is a deal breaker. Why it's happening I don't know, but currently Cog's heartbeat on linux depends on being able to catch SIGUSR2.
HTH Eliot
|
On 22/07/10 22:20, Eliot Miranda wrote:
> Hi Paul,
>
> On Thu, Jul 22, 2010 at 2:14 PM, Paul DeBruicker<[hidden email]> wrote:
>
>> Hi Eliot,
>>
>> On 07/22/2010 05:02 PM, [hidden email] wrote:
>>
>>> handle SIGUSR2 nostop noprint noignore
>>>
>> If I include the above line in my .gdbinit then gdb complains:
>>
>> Cannot find user-level thread for LWP XXXXX
>>
>> where XXXXX is the process number for the VM. Sometimes the VM window
>> stays open and freezes at that point and sometimes it closes. Gdb then
>> states that the "Target is running" when I type in the commands you listed.
>> If I comment the "handle SIGUSR2 ..." line out then I get this from those
>> commands:
>>
>
> just looks like the OS/run-time is not letting the program set a handler for
> SIGUSR2 and/or not allowing it to be caught. This is a deal breaker. Why
> it's happening I don't know, but currently Cog's heartbeat on linux depends
> on being able to catch SIGUSR2.

From: http://pauillac.inria.fr/~xleroy/linuxthreads/faq.html

H.4: With LinuxThreads, I can no longer use the signals SIGUSR1 and SIGUSR2 in my programs! Why?

The short answer is: because the Linux kernel you're using does not support realtime signals.
|
In reply to this post by Rob Withers
Hi Eliot,

> just looks like the OS/run-time is not letting the program set a handler for
> SIGUSR2 and/or not allowing it to be caught. This is a deal breaker. Why
> it's happening I don't know, but currently Cog's heartbeat on linux depends
> on being able to catch SIGUSR2.
>
> HTH
> Eliot
>

From reading this:

http://manpages.ubuntu.com/manpages/lucid/man7/signal.7.html

it looks like Ubuntu defaults to terminating the program on SIGUSR2. Would it be appropriate to use sigaction() somewhere in the VM code to override the OS's default setting for SIGUSR2? I think you could change it from terminate to ignore.

http://opengroup.org/onlinepubs/007908775/xsh/sigaction.html

For SIGUSR2 the int sig value is 12.

Or do you know if it's an environment setting I can change?

Thanks for your help

Paul
|
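For readers following along, here is a minimal sketch of what replacing the default SIGUSR2 action with sigaction() looks like in C. It is not taken from sqUnixHeartbeat.c; the names on_usr2, install_usr2_handler and has_custom_handler are made up for illustration, and the handler only sets a flag rather than doing any real work.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Set by the handler. sig_atomic_t is the type that is safe to
   write from inside a signal handler. */
static volatile sig_atomic_t usr2_seen = 0;

/* Hypothetical handler: just records that SIGUSR2 arrived. */
static void on_usr2(int sig)
{
    (void)sig;
    usr2_seen = 1;
}

/* Replace the default "terminate" action for SIGUSR2 with on_usr2.
   Returns 0 on success, -1 on failure (errno is set by sigaction). */
static int install_usr2_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_usr2;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;   /* restart interrupted system calls */
    return sigaction(SIGUSR2, &sa, NULL);
}

/* Returns 1 if signo currently has a handler function installed,
   0 if it is still at its default action or is ignored. */
static int has_custom_handler(int signo)
{
    struct sigaction cur;

    assert(signo > 0);
    if (sigaction(signo, NULL, &cur) != 0)
        return 0;
    return cur.sa_handler != SIG_DFL && cur.sa_handler != SIG_IGN;
}
```

Calling install_usr2_handler() once at startup and then raise(SIGUSR2) should leave usr2_seen set to 1 instead of terminating the process. If that works in a small test program but not inside the VM, the default disposition is probably not the culprit.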
In reply to this post by Derek O'Connell-2
Hi Derek, On Thu, Jul 22, 2010 at 2:25 PM, Derek O'Connell <[hidden email]> wrote:
I'd forgotten all that! I thought that stuff was ancient history. So we need two things, one is a pair of alternative signals, the other is a reliable #define that we can use to distinguish l'ancien regime from the modern day.
thanks Derek! best Eliot
|
In reply to this post by Paul DeBruicker
On Thu, Jul 22, 2010 at 2:42 PM, Paul DeBruicker <[hidden email]> wrote:
That's exactly what the code does. Look at uses of TICKER_SIGNAL in platforms/unix/vm/sqUnixHeartbeat.c. What Derek points out is that on certain (I thought long gone) LinuxThreads implementations SIGUSR1 & SIGUSR2 are reserved by the threads implementation and used internally. One can't use them in these contexts. So that means a) finding out at compile time whether we're in this regime or not and b) choosing some alternative signals (one for the ticker (USR2), one for dumping all stacks (USR1)).
best Eliot
> I think you could change it from terminate to ignore.
|
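One possible shape for part (b), sketched under the assumption that POSIX realtime signals are an acceptable substitute: prefer SIGRTMIN+n where the headers define it and fall back to the USR signals otherwise. The helper names pick_signal, ticker_signal and stack_dump_signal are hypothetical and are not the VM's (its macro is TICKER_SIGNAL), and this does not answer part (a), the compile-time test for the old regime.

```c
#include <assert.h>
#include <signal.h>

/* Return SIGRTMIN + rt_offset when realtime signals are usable,
   otherwise the given fallback signal. On glibc SIGRTMIN is evaluated
   at run time, after the thread library has taken the realtime
   signals it needs for itself. */
static int pick_signal(int rt_offset, int fallback)
{
    assert(rt_offset >= 0);
#ifdef SIGRTMIN
    if (SIGRTMIN > 0 && SIGRTMIN + rt_offset <= SIGRTMAX)
        return SIGRTMIN + rt_offset;
#endif
    return fallback;
}

/* Signal used to prod the ticker (SIGUSR2 in the thread above). */
static int ticker_signal(void)
{
    return pick_signal(0, SIGUSR2);
}

/* Signal used to request a dump of all stacks (SIGUSR1 in the thread above). */
static int stack_dump_signal(void)
{
    return pick_signal(1, SIGUSR1);
}
```

On a system without realtime signals this falls straight back to the USR pair, which is exactly the case Derek's FAQ entry describes, so it is only a partial answer.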
Hi All, for those of you trying to get Cog working on linux I should say something about the state of the linux port and what to watch out for as you try and get the system up and running. The Cog VMs depend on a heartbeat to periodically interrupt execution and cause the system to poll for input. The default implementation is a high-priority thread that loops sleeping for a short time and then interrupting the VM (by setting the VM's stackLimit to a value that will cause the next frame-building send to check for stack overflow, which as a side effect also checks for input). The default heartbeat frequency is either 500Hz (unix) or 1KHz (win32). The heartbeat also updates the system's clock on each beat, since accessing the clock can be quite expensive and so updating at regular intervals actually provides a cheaper clock with acceptable resolution.
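A rough sketch of the threaded heartbeat described above, for orientation only. It is not the code in sqUnixHeartbeat.c: the poll_requested flag stands in for lowering the VM's stackLimit, cached_millis stands in for the VM's clock, all function names are invented, and thread priorities are left out entirely. Link with -lpthread on older C libraries.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <time.h>

static atomic_bool  heartbeat_running;  /* cleared to stop the thread        */
static atomic_bool  poll_requested;     /* stand-in for lowering stackLimit  */
static atomic_ulong cached_millis;      /* stand-in for the cheap VM clock   */
static pthread_t    heartbeat_thread;
static long         beat_ns = 2000000L; /* 2 ms per beat, i.e. 500 Hz        */

/* Sleep for one beat, refresh the cached clock, ask the VM to poll. */
static void *heartbeat_loop(void *ignored)
{
    (void)ignored;
    while (atomic_load(&heartbeat_running)) {
        struct timespec nap = { 0, beat_ns };
        struct timespec now;

        nanosleep(&nap, NULL);
        clock_gettime(CLOCK_MONOTONIC, &now);
        atomic_store(&cached_millis,
                     (unsigned long)now.tv_sec * 1000UL
                     + (unsigned long)(now.tv_nsec / 1000000L));
        atomic_store(&poll_requested, true);
    }
    return NULL;
}

/* Start the heartbeat at hz beats per second. Returns 0 on success,
   otherwise the pthread_create error number. */
static int start_heartbeat(long hz)
{
    assert(hz >= 2 && hz <= 1000000);
    beat_ns = 1000000000L / hz;
    atomic_store(&heartbeat_running, true);
    return pthread_create(&heartbeat_thread, NULL, heartbeat_loop, NULL);
}

/* Ask the heartbeat thread to exit and wait for it. */
static void stop_heartbeat(void)
{
    atomic_store(&heartbeat_running, false);
    pthread_join(heartbeat_thread, NULL);
}

/* Interpreter side: returns true at most once per beat. */
static bool consume_poll_request(void)
{
    return atomic_exchange(&poll_requested, false);
}
```

The interpreter side would call consume_poll_request() at its check points and read cached_millis instead of making a system call for the time.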
The threaded heartbeat implementation depends on having multiple thread priorities. If the heartbeat thread runs at the same priority as the VM thread then if the VM thread becomes compute intensive the heartbeat will be starved of cycles. The VM won't see input and the clock won't update.
The current state of posix threads on RedHat-derived linux distros is that multiple thread priorities are only available to processes running with superuser privileges, and so not practically available to the VM. Hence the unix heartbeat (platforms/unix/vm/sqUnixHeartbeat.c) provides a fallback selected by defining ITIMER_HEARTBEAT=1 at compile time. This avoids the use of a thread and falls back to the ITIMER_REAL interval timer (setitimer(2)). This isn't ideal; we would like to use the heartbeat to run other periodic high-priority activities, but with some fiddling it works.
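To make the fallback concrete, here is a minimal sketch of a heartbeat driven by ITIMER_REAL, which delivers SIGALRM at a fixed interval. It only illustrates the setitimer(2) mechanism; the counter and the names itimer_beat, start_itimer_heartbeat, stop_itimer_heartbeat and wait_for_beats are assumptions for the example, not what ITIMER_HEARTBEAT=1 actually compiles in.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

/* Number of beats delivered so far; written only by the handler. */
static volatile sig_atomic_t itimer_beats = 0;

/* SIGALRM handler: one call per timer expiry. */
static void itimer_beat(int sig)
{
    (void)sig;
    itimer_beats++;
}

/* Install the handler and arm ITIMER_REAL to fire every interval_us
   microseconds. Returns 0 on success, -1 on failure. */
static int start_itimer_heartbeat(long interval_us)
{
    struct sigaction sa;
    struct itimerval tv;

    assert(interval_us > 0 && interval_us < 1000000);
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = itimer_beat;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    if (sigaction(SIGALRM, &sa, NULL) != 0)
        return -1;

    tv.it_interval.tv_sec = 0;
    tv.it_interval.tv_usec = interval_us;
    tv.it_value = tv.it_interval;       /* first expiry after one interval */
    return setitimer(ITIMER_REAL, &tv, NULL);
}

/* Disarm the timer. Returns 0 on success, -1 on failure. */
static int stop_itimer_heartbeat(void)
{
    struct itimerval off;

    memset(&off, 0, sizeof off);
    return setitimer(ITIMER_REAL, &off, NULL);
}

/* Block until at least n beats have been delivered. */
static void wait_for_beats(int n)
{
    while (itimer_beats < n)
        pause();    /* returns each time a handled signal arrives */
}
```

start_itimer_heartbeat(2000) gives the 500Hz rate mentioned above. Because the beat runs in a signal handler on the VM thread itself, no thread priority is involved.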
At Teleplace we have some quite sophisticated media processing code that is run from the heartbeat thread, except that on linux there isn't one. So on linux there is a second thread dedicated to these activities (see platforms/Cross/vm/sqTicker.c for the facilities that allow one to install periodic high-priority function calls) and a combination of forcing the VM thread to block and sending SIGUSR2 to the "high-priority" thread (to break it out of a blocking sleep) simulates a high-priority thread preempting the VM thread, at least if you cross your fingers and it's Wednesday.
But for linux users out there I need to emphasize that these shenanigans are only necessary if the thread system doesn't support multiple priorities for user-level processes. If the thread system /does/ support multiple priorities then the correct solution is to compile without defining ITIMER_HEARTBEAT=1. I think you'll find that if you compile without ITIMER_HEARTBEAT the VM will complain if the thread system doesn't allow it to set the heartbeat thread's priority above the VM thread's priority. So those of you on LinuxThreads where SIGUSR1 & SIGUSR2 are used internally need to see whether the system does support multiple priorities for user processes and if so a) abandon the ITIMER_HEARTBEAT, b) choose a different signal to SIGUSR1 for provoking a printAllStacks report, and c) report back what system you're on and if possible how to distinguish that pthreads variant from the pthread.h header.
If on the other hand the system doesn't support multiple thread priorities and still uses SIGUSR1 & SIGUSR2 internally (nice user-centric choice ;) ) then you need to retain ITIMER_HEARTBEAT=1 and find alternative signal numbers for the prodding of the "high-priority" thread and the printAllStacks report.
HTH Eliot |
In reply to this post by Eliot Miranda-2
Hi Eliot,

On 22/07/10 22:44, Eliot Miranda wrote:
>>> just looks like the OS/run-time is not letting the program set a
>>> handler for SIGUSR2 and/or not allowing it to be caught. This is
>>> a deal breaker. Why it's happening I don't know, but currently
>>> Cog's heartbeat on linux depends on being able to catch SIGUSR2.
>>>
>>
>> From: http://pauillac.inria.fr/~xleroy/linuxthreads/faq.html
>>
>> H.4: With LinuxThreads, I can no longer use the signals SIGUSR1 and
>> SIGUSR2 in my programs! Why?
>>
>> The short answer is: because the Linux kernel you're using does not
>> support realtime signals.
>>
>
> I'd forgotten all that! I thought that stuff was ancient history.
> So we need two things, one is a pair of alternative signals, the
> other is a reliable #define that we can use to distinguish l'ancien
> regime from the modern day.

Something smells fishy about signals specifically reserved for user apps then being re-reserved for something else, and since the page I linked to begins with the warning "This FAQ has not been updated for a while and may not be 100% up to date" I am trying to clarify the situation.
Still at it but here is what I have dug up so far:

- "LinuxThreads" generally refers to threading on pre-2.6 kernels
- "NPTL", Native POSIX Threads Library for Linux, replaces "LinuxThreads" on 2.5+ kernels (publicly 2.6+)
- "NGPT", IBM's Next Generation POSIX Threads, for 2.4 kernels and earlier, works/worked in conjunction with LinuxThreads
- RedHat back-ported NPTL to pre-2.6 kernels and made the threading model selectable between NPTL/LinuxThreads on a per process basis

To determine the threading library that a system uses (example shown for my system: Ubuntu 9.10, 2.6.31-22-generic #60-Ubuntu SMP Thu May 27 00:22:23 UTC 2010 i686 GNU/Linux):

> getconf GNU_LIBPTHREAD_VERSION
NPTL 2.10.1

So that fishy smell might come from a RedHat specific red-herring ;-) Most likely on modern distros with a 2.6+ kernel:

A) NPTL *is* being used
B) SIGUSR1/2 are *not* reserved
C) SIGUSR1/2, if they are indeed the source of any problems, may be getting used elsewhere (in a plugin perhaps)

More on this below but first I would throw into the mix what in my limited experience has sometimes been the source of odd problems. This is the handling of EINTR/EAGAIN errors and, AFAICT, the increased likelihood of these errors occurring depending on how busy a process and/or the system in general is. I have seen code written pre-2.6 which has worked well even post-2.6 until system load increases and/or it is used in multi-threading. Some ioctl() calls then fail, but since the immediate code does not handle EINTR/EAGAIN the result is some obscure error at a higher level.

Like Paul and Rob I got the VM compiled last weekend but it would either crash immediately or after maybe 15/20s. Then I crashed when I got to the stage of trying to debug a multi-threaded application :-)

I'm admittedly largely ignorant of how Cog changes the VM and apologise if my post re SIGUSR1/2 does prove to be a red-herring but I'm still wondering about the need for multi-threading for non-Teleplace use?
From your latest email it seems as if multi-threading has been introduced to support high-priority for Teleplace "media processing". If this is functionality that will remain private to Teleplace and there is no clear benefit to others for a core VM high-priority thread, and given the difficulties debugging, then could public Cog be single threaded? -D |
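The getconf check above can also be made from C at run time, which may help anyone reporting back which thread library they are on. This is a sketch assuming a glibc-style C library that defines _CS_GNU_LIBPTHREAD_VERSION; the helper names pthread_library_name and uses_nptl are hypothetical, and on other C libraries the query may simply report nothing.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>

/* Copy the thread library's self-description (e.g. "NPTL 2.10.1")
   into buf. Returns 1 on success, 0 if the C library does not
   report one or the buffer is too small. */
static int pthread_library_name(char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    buf[0] = '\0';
#ifdef _CS_GNU_LIBPTHREAD_VERSION
    size_t n = confstr(_CS_GNU_LIBPTHREAD_VERSION, buf, len);
    return n > 0 && n <= len;
#else
    return 0;
#endif
}

/* Returns 1 if the reported thread library name starts with "NPTL". */
static int uses_nptl(void)
{
    char name[64];

    return pthread_library_name(name, sizeof name)
        && strncmp(name, "NPTL", 4) == 0;
}
```

This is a run-time answer only. It does not provide the compile-time #define Eliot asked for.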