Hi guys,
I have started seeing a weird gem crash that I haven't seen before, so I am pasting the stack here. I am using GemStone 3.1.0.6 with Seaside and native code enabled on Linux CentOS 7. Thanks in advance for any tips,

Thu Jun 25 13:54:15 EDT 2015
gdb is /bin/gdb
===--- start gdb stacks Thu Jun 25 13:54:16 EDT 2015
[New LWP 23704]
[New LWP 23699]
[New LWP 23698]
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib64/libthread_db.so.1".
0x00007f2fb9c91bdd in poll () from /lib64/libc.so.6

Thread 4 (Thread 0x7f2f3d4dc700 (LWP 23698)):
#0 0x00007f2fbafd45eb in recv () from /lib64/libpthread.so.0
#1 0x00007f2fb9217d31 in SocketRead (sock=7, dataPtr=0x7f2f3d4dbedf "\002L\377~\262/\177", dataSize=1, peek=<optimized out>, numRead=0x7f2f3d4dbed0, notDone=0x0, interrupted=0x7f2f3d4dbed4, err=0x7f2f3d4db6c0) at /export/jupiter3/users/buildgss/gs64/3105x/build33242/src/socket.c:2601
#2 0x00007f2fb92011f9 in stnOobReaderThreadFn (arg=<optimized out>) at /export/jupiter3/users/buildgss/gs64/3105x/build33242/src/stncall.c:553
#3 0x00007f2fbafcddf3 in start_thread () from /lib64/libpthread.so.0
#4 0x00007f2fb9c9c1ad in clone () from /lib64/libc.so.6

Thread 3 (Thread 0x7f2f08d57700 (LWP 23699)):
#0 0x00007f2fbafd45eb in recv () from /lib64/libpthread.so.0
#1 0x00007f2fb9217d31 in SocketRead (sock=9, dataPtr=0x7f2f08d56ed0 "", dataSize=4, peek=<optimized out>, numRead=0x7f2f08d56ecc, notDone=0x0, interrupted=0x7f2f08d56ec8, err=0x7f2f08d566b0) at /export/jupiter3/users/buildgss/gs64/3105x/build33242/src/socket.c:2601
#2 0x00007f2fb92014b7 in shrpcmonSocketReadFn (arg=<optimized out>) at /export/jupiter3/users/buildgss/gs64/3105x/build33242/src/stncall.c:403
#3 0x00007f2fbafcddf3 in start_thread () from /lib64/libpthread.so.0
#4 0x00007f2fb9c9c1ad in clone () from /lib64/libc.so.6

Thread 2 (Thread 0x7f2f0822f700 (LWP 23704)):
#0 0x00007f2fb9c91bdd in poll () from /lib64/libc.so.6
#1 0x00007f2fb914b790 in timeoutThreadFn (arg=<optimized out>) at /export/jupiter3/users/buildgss/gs64/3105x/build33242/src/socketprim.c:3207
#2 0x00007f2fbafcddf3 in start_thread () from /lib64/libpthread.so.0
#3 0x00007f2fb9c9c1ad in clone () from /lib64/libc.so.6

Thread 1 (Thread 0x7f2fbb3ef740 (LWP 23690)):
#0 0x00007f2fb9c91bdd in poll () from /lib64/libc.so.6
#1 0x00007f2fb91dc943 in HostMilliSleep (milliseconds=200, exitIfInterrupted=<optimized out>) at /export/jupiter3/users/buildgss/gs64/3105x/build33242/src/hostunixmt.c:649
#2 0x00007f2fb91db49a in forkAndWait (cmdPath=0x7ffc7bf70330 "/opt/gemstone/GemStone64Bit3.1.0.6-x86_64.Linux/bin/pstack", args=0x7ffc7bf72440) at /export/jupiter3/users/buildgss/gs64/3105x/build33242/src/hostdebugmt.c:66
#3 0x00007f2fb91db679 in HostPrintCStackForPid (pid=23690) at /export/jupiter3/users/buildgss/gs64/3105x/build33242/src/hostdebugmt.c:393
#4 0x00007f2fb91db6ca in HostPrintCStack () at /export/jupiter3/users/buildgss/gs64/3105x/build33242/src/hostdebugmt.c:353
#5 0x00007f2fb92200dd in HostFaultHandler (sig=11, info=0x7ffc7bf72830, context=0x7ffc7bf72700) at /export/jupiter3/users/buildgss/gs64/3105x/build33242/src/hostunix.c:1160
#6 <signal handler called>
#7 om::GsSocketDoPoll192 (omPtr=0x7f2f3d4de000, ARStackPtr=<optimized out>) at /export/jupiter3/users/buildgss/gs64/3105x/build33242/src/om.hf:4318
#8 0x00007f2fab2b035b in ?? ()
#9 0x0000000000000000 in ?? ()

Thread 4 (Thread 0x7f2f3d4dc700 (LWP 23698)):
#0 0x00007f2fbafd45eb in recv () from /lib64/libpthread.so.0
No symbol table info available.
#1 0x00007f2fb9217d31 in SocketRead (sock=7, dataPtr=0x7f2f3d4dbedf "\002L\377~\262/\177", dataSize=1, peek=<optimized out>, numRead=0x7f2f3d4dbed0, notDone=0x0, interrupted=0x7f2f3d4dbed4, err=0x7f2f3d4db6c0) at /export/jupiter3/users/buildgss/gs64/3105x/build33242/src/socket.c:2601
        success = <optimized out>
        continueIfInterrupted = 0
        callInterrupted = <optimized out>
        flags = 0
        errNum = 1
#2 0x00007f2fb92011f9 in stnOobReaderThreadFn (arg=<optimized out>) at /export/jupiter3/users/buildgss/gs64/3105x/build33242/src/stncall.c:553
        interrupted = 0
        socketBuf = <optimized out>
        numRead = 1
        success = 1
        wks = 0x7f2f3d4ee2a8
        thr = 0x7f2f3d4ee350
        sessionPtr = 0x7f2f3d4ef840
        oobSocket = 7
        sErr = {errCode = SYSERR_NONE, errNum = 0, eaiError = 0, categ = SYSERRCAT_ERRNO, errMsg = '\000' <repeats 1023 times>}
        status = 0
        stopRequestVal = 0
        doCloseSocket = <optimized out>
#3 0x00007f2fbafcddf3 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#4 0x00007f2fb9c9c1ad in clone () from /lib64/libc.so.6
No symbol table info available.

Thread 3 (Thread 0x7f2f08d57700 (LWP 23699)):
#0 0x00007f2fbafd45eb in recv () from /lib64/libpthread.so.0
No symbol table info available.
#1 0x00007f2fb9217d31 in SocketRead (sock=9, dataPtr=0x7f2f08d56ed0 "", dataSize=4, peek=<optimized out>, numRead=0x7f2f08d56ecc, notDone=0x0, interrupted=0x7f2f08d56ec8, err=0x7f2f08d566b0) at /export/jupiter3/users/buildgss/gs64/3105x/build33242/src/socket.c:2601
        success = <optimized out>
        continueIfInterrupted = 0
        callInterrupted = <optimized out>
        flags = 0
        errNum = 148207360
#2 0x00007f2fb92014b7 in shrpcmonSocketReadFn (arg=<optimized out>) at /export/jupiter3/users/buildgss/gs64/3105x/build33242/src/stncall.c:403
        success = 148207360
        wks = 0x7f2f3d4ee2a8
        thr = 0x7f2f3d4ee388
        ibSocket = 9
        sErr = {errCode = SYSERR_NONE, errNum = 0, eaiError = 0, categ = SYSERRCAT_ERRNO, errMsg = '\000' <repeats 1023 times>}
        status = 0
        socketBuf = "\000\000\000"
        numRead = 0
        interrupted = 0
        stopRequestVal = 0
        doCloseSocket = <optimized out>
#3 0x00007f2fbafcddf3 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#4 0x00007f2fb9c9c1ad in clone () from /lib64/libc.so.6
No symbol table info available.

Thread 2 (Thread 0x7f2f0822f700 (LWP 23704)):
#0 0x00007f2fb9c91bdd in poll () from /lib64/libc.so.6
No symbol table info available.
#1 0x00007f2fb914b790 in timeoutThreadFn (arg=<optimized out>) at /export/jupiter3/users/buildgss/gs64/3105x/build33242/src/socketprim.c:3207
        nowUsec = 139842957155293
        numFound = <optimized out>
        commandSock = 11
        omPtr = 0x7f2f3d4de000
        rwks = 0x7f2f3d4ee2a8
        sErr = {errCode = SYSERR_NONE, errNum = 0, eaiError = 0, categ = SYSERRCAT_ERRNO, errMsg = '\000' <repeats 1023 times>}
        pollArr = {{fd = 11, events = 1, revents = 0}}
        wakeupCmd = {signalTimeMs = 0, signalPriority = 0}
        msToWait = 29999
        signalUsec = 1435254877152000
#2 0x00007f2fbafcddf3 in start_thread () from /lib64/libpthread.so.0
No symbol table info available.
#3 0x00007f2fb9c9c1ad in clone () from /lib64/libc.so.6
No symbol table info available.
Thread 1 (Thread 0x7f2fbb3ef740 (LWP 23690)):
#0 0x00007f2fb9c91bdd in poll () from /lib64/libc.so.6
No symbol table info available.
#1 0x00007f2fb91dc943 in HostMilliSleep (milliseconds=200, exitIfInterrupted=<optimized out>) at /export/jupiter3/users/buildgss/gs64/3105x/build33242/src/hostunixmt.c:649
No locals.
#2 0x00007f2fb91db49a in forkAndWait (cmdPath=0x7ffc7bf70330 "/opt/gemstone/GemStone64Bit3.1.0.6-x86_64.Linux/bin/pstack", args=0x7ffc7bf72440) at /export/jupiter3/users/buildgss/gs64/3105x/build33242/src/hostdebugmt.c:66
        waitResult = 1
        buf = "\000\000\367{\374\177\000\000\000\000\000\000\000\000\000\000\v\000\000\000\000\000\000\000\001\000\000\000\000\000\000\000\001", '\000' <repeats 15 times>, "\f\205)\271/\177\000\000\300$\367{\374\177\000\000\377\377\377\377\000\000\000\000N\263\035\271/\177\000\000\332W\303\001\000\000\000\000@\364\365\271/\177\000\000\000\000\000\000\000\000\000\000\060\003\367{\374\177\000\000\000\020\000\000\000\000\000\000\060\023\367{\374\177\000\000\340W\303\001\000\000\000\000\341W\303\001\000\000\000\000\342W\303\001\000\000\000\000\343W\303\001\000\000\000\000\344W\303\001\000\000\000\000\345W\303\001\000\000\000\000\300$\367{\374\177\000\000`$\367{\374\177\000\000\000\000\000\000\000\000\000\000"...
        oldSignal = {__sigaction_handler = {sa_handler = 0x7f2fb921ea10 <HostUnixSigChildHandler(int, siginfo_t*, void*)>, sa_sigaction = 0x7f2fb921ea10 <HostUnixSigChildHandler(int, siginfo_t*, void*)>}, sa_mask = {__val = {18446744067266838271, 29579166, 29579167, 29579168, 29579169, 29579170, 29579171, 29579172, 29579173, 29579174, 29579175, 29579176, 139842979563702, 5, 0, 0}}, sa_flags = 335544327, sa_restorer = 0x7f2fbafd5130 <__restore_rt>}
        childStatus = 32764
        child = 6427
#3 0x00007f2fb91db679 in HostPrintCStackForPid (pid=23690) at /export/jupiter3/users/buildgss/gs64/3105x/build33242/src/hostdebugmt.c:393
        argArray = {0x7ffc7bf71330 "pstack", 0x7ffc7bf72430 "23690", 0x0}
        pidStr = "23690\000\000\000\000\000\000\000\000\000\000"
        myPid = 23690
        buffer = "\nBegin attempt to print C-level stack for process 23690 at: \000apno 0xe oldmask 0x0 cr2 0x26b8811 \n\000ad 0xe033 \n\000\000\000\220ZN=/\177\000\000 \000\000\000\060\000\000\000\200$\367{\374\177\000\000\300#\367{\374\177\000\000\v\000\000\000\000\000\000\000\001\000\000\000\000\000\000\000\001\000\000\000\000\000\000\000\021\210k\002\000\000\000\000\377\355\276\271/\177\000\000\034", '\000' <repeats 47 times>...
        cmdName = "pstack\000\000\205o\303\001\000\000\000\000\232o\303\001\000\000\000\000\233o\303\001\000\000\000\000\235o\303\001\000\000\000\000\244o\303\001\000\000\000\000\245o\303\001\000\000\000\000\266o\303\001\000\000\000\000\270o\303\001\000\000\000\000\307t\303\001\000\000\000\000\310t\303\001\000\000\000\000\311t\303\001\000\000\000\000\312t\303\001\000\000\000\000\313t\303\001\000\000\000\000\314t\303\001\000\000\000\000\315t\303\001\000\000\000\000\316t\303\001\000\000\000\000\317t\303\001\000\000\000\000\320t\303\001\000\000\000\000\321t\303\001\000\000\000\000\322t\303\001\000\000\000\000\323t\303\001\000\000\000\000\324t\303\001\000\000\000\000\325t\303\001\000\000\000\000\326t\303\001\000\000\000\000"...
        cmdNameWithPath = "/opt/gemstone/GemStone64Bit3.1.0.6-x86_64.Linux/bin/pstack\000\001\000\000\000\000\022X\303\001\000\000\000\000\023X\303\001\000\000\000\000\024X\303\001\000\000\000\000\025X\303\001\000\000\000\000\026X\303\001\000\000\000\000\027X\303\001\000\000\000\000\030X\303\001\000\000\000\000\031X\303\001\000\000\000\000\032X\303\001\000\000\000\000\033X\303\001\000\000\000\000\034X\303\001\000\000\000\000\035X\303\001\000\000\000\000\036X\303\001\000\000\000\000\037X\303\001\000\000\000\000 X\303\001\000\000\000\000!X\303\001\000\000\000\000\"X\303\001\000\000\000\000"...
#4 0x00007f2fb91db6ca in HostPrintCStack () at /export/jupiter3/users/buildgss/gs64/3105x/build33242/src/hostdebugmt.c:353
No locals.
#5 0x00007f2fb92200dd in HostFaultHandler (sig=11, info=0x7ffc7bf72830, context=0x7ffc7bf72700) at /export/jupiter3/users/buildgss/gs64/3105x/build33242/src/hostunix.c:1160
        mgr = <optimized out>
        sigName = <optimized out>
        inGcilnk = 1
        savedHandler = <optimized out>
        foundSignal = 0
        numSegvInProgress = <optimized out>
        msgBuf = "\n\nGemstone Signal Handler: Signal 11, SIGSEGV Received \n", '\000' <repeats 198 times>
        validArgs = <optimized out>
#6 <signal handler called>
No symbol table info available.
#7 om::GsSocketDoPoll192 (omPtr=0x7f2f3d4de000, ARStackPtr=<optimized out>) at /export/jupiter3/users/buildgss/gs64/3105x/build33242/src/om.hf:4318
        pollArrH = 0x7f2f401b2048
        firstDeleteIdx = 1
        pollIdx = 1
        myScope = {_root = 0x7f2f3d4dedb0, parent = 0x7ffc7bf73530, basePtr = 0x7f2f401b2038}
        sockH = 0x7f2f401b2040
        waitersH = <optimized out>
        aWaiterProcH = 0x7f2f401b2058
        numToDelete = 1
        pollArray = 0x7f2f63783a08
        resArrH = 0x7f2f3d564db0
        oMsToWait = <optimized out>
        waitOk = <optimized out>
        msToWait = <optimized out>
        alreadyTimedOut = <optimized out>
        timedOut = 0
        sErr = {errCode = SYSERR_NONE, errNum = 0, eaiError = 0, categ = SYSERRCAT_ERRNO, errMsg = '\000' <repeats 144 times>, "\006\000\000\000\070\060\061\065MB", '\000' <repeats 142 times>...}
        evSet = 0x7f2f6374fd88
        numEvents = 0
        resIdx = 1
        rc = <optimized out>
        rc = <optimized out>
#8 0x00007f2fab2b035b in ?? ()
No symbol table info available.
#9 0x0000000000000000 in ?? ()
No symbol table info available.
$1 = 1
$2 = 1
$3 = {<text variable, no debug info>} 0x7f2fb90b6680 <IntLpBCLoop>
===--- end gdb stacks

_______________________________________________
Glass mailing list
[hidden email]
http://lists.gemtalksystems.com/mailman/listinfo/glass
Mariano,
These crashes appear to be due to some bugs in the process scheduler that have been fixed in 3.2.x...

Dale

On 06/25/2015 12:06 PM, Mariano Martinez Peck via Glass wrote:
Hi Dale,

Is there a bug report I can read about that bug, so that I can check whether it is the one I am hitting? I haven't seen this before, and now I am starting to see it quite frequently. Could it be related to firing background jobs from Seaside? I have something similar to service VM where, from within Seaside, I call topaz and open another gem to process something (Otto shared this code with me). To run these background jobs, I am doing this:

    executeClosureWithBackgroundPriority: aBlockClosure
        "Executes aBlockClosure with background priority and answers the result of the closure"

        | oldPriority lowerPriority |
        oldPriority := System priorityForSessionId: System session.
        lowerPriority := oldPriority - 1 max: 0.
        System setSessionPriority: lowerPriority forSessionId: System session.
        ^ [ aBlockClosure value ]
            ensure: [ System setSessionPriority: oldPriority forSessionId: System session ].

Could this be related? I ask because I started using it only recently...

Thanks in advance,

On Thu, Jun 25, 2015 at 6:53 PM, Dale Henrichs via Glass <[hidden email]> wrote:
Mariano,
I don't know the specific bug numbers, but the two places to look would be the Release Notes and Bug Notes for the 3.2.x [1] and 3.1.x [2] releases. It is worth noting that we have recently added a site search field that can be found at the bottom of each page on the GemTalk Systems site, or here [3].

I ran a search on `scheduler` and had a few hits [4], but frankly it is possible that we fixed bugs without recording specific bug reports and/or release notes and/or bug notes ... I got my information from the engineer himself :)

The scheduler bugs that I am referring to have to do with the ProcessorScheduler. GsSocketDoPoll192 (where the error occurs) is involved in process scheduling, and the engineer says that there were bugfixes in that area ... so the error is related to using green threads within a gem ... the code that you are asking about is related to multi-gem processing and isn't directly related, unless you are forking a process and waiting for the results ....

Dale

[1] http://gemtalksystems.com/products/gs64/versions32x/
[2] http://gemtalksystems.com/products/gs64/versions31x/
[3] https://cse.google.com/cse/home?cx=014580976650604809618:1cdoq5jo3te
[4] https://www.google.com/cse?cx=014580976650604809618:1cdoq5jo3te&filter=0&q=scheduler&oq=scheduler&gs_l=partner.3...4787.7298.0.9467.9.9.0.0.0.0.124.845.7j2.9.0.gsnos%2Cn%3D13...0.2536j1596992j9..1ac.1.25.partner..9.0.0.Qu3WQWj_sk0#gsc.tab=0&gsc.q=scheduler&gsc.page=1

On 06/26/2015 01:07 PM, Mariano Martinez Peck wrote:
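
For readers following the thread: the "forking a process and waiting for the results" case mentioned in the reply above would look roughly like the following inside a single gem. This is only a minimal sketch, assuming the usual Smalltalk `Semaphore` and block `fork` protocol is available in your GemStone version. The work inside the block is a hypothetical placeholder, and nothing in this thread confirms that this exact pattern triggers the crash.

```smalltalk
"Hypothetical sketch: run work in a forked green thread (process) inside
 the gem and block the caller until that work has finished.
 Assumes the standard Semaphore and block fork protocol."
| done result |
done := Semaphore new.
result := nil.
[
  result := 3 + 4.    "placeholder for the real work"
  done signal         "wake up the process waiting below"
] fork.
done wait.            "caller is suspended here until the forked block signals"
result
```

If application code does something like this (directly, or through a framework that forks processes on its behalf), it is using green threads within the gem and therefore goes through the ProcessorScheduler; code that only changes the session priority does not, by itself, fork anything.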