• OpenBSD Synchronet 3.16 crash (connection lost/process killed variety)

    From Khelair@VERT/TINFOIL to All on Saturday, December 13, 2014 10:17:45
    So, I finally got another crash. FWIW, right now I've grouped the
    crashes into 3 distinct categories, which is really weird, as most of
    them didn't seem to be happening before. I'm going to take my system
    down and memtest86 it and do a few other things to rule out any sort of hardware issues contributing to this.
    Anyway, this is still on OBSD, Synchronet 3.16. The crash was one of
    the variety where all connections are dropped. Running things from the
    console with debugging on, here's what I gleaned:
    In the 'screen' session where I have the process attached, the vast
    majority of the scrollback was completely full with:
    pthread_mutex_destroy on mutex with waiters!

    At the very end there were a few more useful bits, indicating a
    segfault, and the like:
    pthread_mutex_destroy on mutex with waiters!
    12/13 02:37:58 evnt Running timed event: NEWSLINK
    12/13 02:37:58 evnt Executing external: ?newslink
    /sbbs/ctrl/newslink.cfg
    12/13 02:37:58 evnt Synchronet NewsLink 1.102 session started
    12/13 02:37:58 evnt server: news.eternal-september.org
    12/13 02:37:58 evnt 69 areas
    12/13 02:37:58 evnt Connecting to news.eternal-september.org port 119
    ...
    [Threads: 18 Sockets: 34 Clients: 3 Served: 125 Errors: 6] (?=Help): Segmentation fault

    That's really all I've got right now. Please don't be afraid to tell
    me I'm missing something stupid for more information... My deductive
    reasoning and troubleshooting haven't been the best since some recent
    emotional issues. :P
    TIA

    ---
    þ Synchronet þ Tinfoil Tetrahedron BBS telnet://tinfoil.synchro.net
  • From Digital Man@VERT to Khelair on Saturday, December 13, 2014 18:07:52
    Re: OpenBSD Synchronet 3.16 crash (connection lost/process killed variety)
    By: Khelair to All on Sat Dec 13 2014 10:17 am

    [Threads: 18 Sockets: 34 Clients: 3 Served: 125 Errors: 6] (?=Help): Segmentation fault

    That's really all I've got right now.

    That means that you could (or should) have a core dump. Use gdb on the core dump to extract a backtrace and post that here.
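
For the backtrace step, a minimal sketch of driving gdb non-interactively from Python. The binary path and core file name in the usage note are assumptions for illustration, not paths confirmed anywhere in this thread. `-batch` and `-ex` are standard gdb options, and `thread apply all bt` prints a backtrace for every thread, which matters for a multi-threaded server.

```python
import subprocess


def backtrace_command(binary, core):
    """Build a gdb command line that prints a backtrace for every thread."""
    return ["gdb", "-batch", "-ex", "thread apply all bt", binary, core]


def collect_backtrace(binary, core):
    """Run gdb against the core file and return its combined text output."""
    result = subprocess.run(
        backtrace_command(binary, core),
        capture_output=True,
        text=True,
        check=False,  # keep whatever gdb printed even if it exits non-zero
    )
    return result.stdout + result.stderr
```

Something like `print(collect_backtrace("/sbbs/exec/sbbs", "sbbs.core"))` would then produce text suitable for posting, assuming gdb is installed and both (hypothetical) paths exist.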

    digital man

    Synchronet "Real Fact" #3:
    Synchronet version 3 is written mostly in C, with some C++, x86 ASM, and Pascal.
    Norco, CA WX: 54.5°F, 69.0% humidity, 0 mph ENE wind, 0.02 inches rain/24hrs

    ---
    þ Synchronet þ Vertrauen þ Home of Synchronet þ telnet://vert.synchro.net
  • From Khelair@VERT/TINFOIL to Digital Man on Wednesday, December 17, 2014 11:48:38
    Re: OpenBSD Synchronet 3.16 crash (connection lost/process killed variety)
    By: Digital Man to Khelair on Sat Dec 13 2014 18:07:52

    [Threads: 18 Sockets: 34 Clients: 3 Served: 125 Errors: 6]
    (?=Help): Segmentation fault

    That's really all I've got right now.

    That means that you could (or should) have a core dump. Use gdb on the core dump to extract a backtrace and post that here.

    I've gotten this advice and I've looked before. I have nothing except a couple of scfg.core files lying around in my /sbbs tree. Right now I'm doing a comprehensive search over my entire filesystem for anything matching *.core, and searching it for sbbs. We'll see if anything turns up there, but I don't believe that I've found anything before.
    I'm not sure if there's something I'm doing wrong with OS settings that might be ditching a core file here, or what, but I've looked for these things before. :P
    aaaaand yessir, no sbbs core files anywhere. I did a find across the entire filesystem, as root, looking for *.core, and then searched it for 'sbbs', and this is all that I got:

    ./sbbs/3rdp/src/cl/session/scorebrd.c
    ./sbbs/3rdp/src/cl/session/scorebrd.h
    ./sbbs/3rdp/src/cl/static-obj/scorebrd.o
    ./sbbs/ctrl/scfg.core
    ./sbbs/data/dirs/misc/ecorephy.pdf
    ./sbbs/data/text/survival/corekit.txt
    ./sbbs/data/scfg.core
    ./sbbs/xtrn/chickendelivery/netscore.ans
    ./sbbs/xtrn/druglord/highscore.ans
    ./sbbs/xtrn/gooble/scores.bin

    Any suggestions? I'll look into possible OBSD settings that might be keeping it from dropping a core, but I'm kind of stumped for now.
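
The listing above includes names such as scorebrd.c and highscore.ans, which suggests the search matched any name containing "core". A stricter search can be sketched as follows. The pattern list is an assumption about common naming conventions (`prog.core` on the BSDs, `core` or `core.*` on many Linux setups), not something the thread confirms for this system.

```python
import fnmatch
import os

# Assumed naming conventions for core dumps; adjust for the local system.
CORE_PATTERNS = ("*.core", "core", "core.*")


def find_core_files(root):
    """Return sorted paths under root whose file name looks like a core dump."""
    matches = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            # fnmatchcase keeps matching case-sensitive on every platform
            if any(fnmatch.fnmatchcase(name, pat) for pat in CORE_PATTERNS):
                matches.append(os.path.join(dirpath, name))
    return sorted(matches)
```

Calling `find_core_files("/sbbs")` on the tree shown above would keep the two scfg.core files and drop the scoreboard and text files.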

    ---
    þ Synchronet þ Tinfoil Tetrahedron BBS telnet://tinfoil.synchro.net
  • From Digital Man@VERT to Khelair on Wednesday, December 17, 2014 18:46:28
    Re: OpenBSD Synchronet 3.16 crash (connection lost/process killed variety)
    By: Khelair to Digital Man on Wed Dec 17 2014 11:48 am

    [Threads: 18 Sockets: 34 Clients: 3 Served: 125 Errors: 6]
    (?=Help): Segmentation fault

    That's really all I've got right now.

    That means that you could (or should) have a core dump. Use gdb on the core dump to extract a backtrace and post that here.

    I've gotten this advice and I've looked before. I have nothing except a couple of scfg.core files lying around in my /sbbs tree. Right now I'm doing a comprehensive search over my entire filesystem for anything
    matching *.core, and searching it for sbbs. We'll see if anything turns up there, but I don't believe that I've found anything before.
    I'm not sure if there's something I'm doing wrong with OS settings that might be ditching a core file here, or what, but I've looked for these things before. :P
    aaaaand yessir, no sbbs core files anywhere. I did a find across the entire filesystem, as root, looking for *.core, and then searched it for 'sbbs', and this is all that I got:

    ./sbbs/3rdp/src/cl/session/scorebrd.c
    ./sbbs/3rdp/src/cl/session/scorebrd.h
    ./sbbs/3rdp/src/cl/static-obj/scorebrd.o
    ./sbbs/ctrl/scfg.core
    ./sbbs/data/dirs/misc/ecorephy.pdf
    ./sbbs/data/text/survival/corekit.txt
    ./sbbs/data/scfg.core
    ./sbbs/xtrn/chickendelivery/netscore.ans
    ./sbbs/xtrn/druglord/highscore.ans
    ./sbbs/xtrn/gooble/scores.bin

    Any suggestions?

    The file could be named something else (like core.<pid>.sbbs), it depends on your system configuration.

    I'll look into possible OBSD settings that might be
    keeping it from dropping a core, but I'm kind of stumped for now.

    On Debian Linux, it's controlled with the /etc/sysctl.conf file:
    # Controls whether core dumps will append the PID to the core filename.
    # Useful for debugging multi-threaded applications.
    kernel.core_uses_pid = 1
    kernel.core_pattern = /tmp/core.%e.%p

    ... in addition to the core file size limit imposed on the user ('ulimit -c').
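
To make the Debian settings above concrete, here is a small sketch of how `kernel.core_pattern` turns into a file name. It handles only `%e` (executable name), `%p` (PID), and `%%`. The process name and PID are made-up values, and these are Linux sysctls, so they are not expected to apply to OpenBSD as-is.

```python
def expand_core_pattern(pattern, exe, pid):
    """Expand the %e, %p and %% specifiers of a Linux core_pattern string."""
    out = []
    i = 0
    while i < len(pattern):
        ch = pattern[i]
        if ch == "%" and i + 1 < len(pattern):
            spec = pattern[i + 1]
            if spec == "e":
                out.append(exe)
            elif spec == "p":
                out.append(str(pid))
            elif spec == "%":
                out.append("%")
            else:
                out.append("%" + spec)  # leave unsupported specifiers untouched
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


# Hypothetical process name and PID, for illustration only.
print(expand_core_pattern("/tmp/core.%e.%p", "sbbs", 1234))  # /tmp/core.sbbs.1234
```

With that pattern the dump lands in /tmp rather than in the BBS's working directory, which is one reason a search limited to the /sbbs tree can come up empty.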

    If you're not getting a core dump when a process segfaults, then you should reconfigure your system to do so (for purposes of debugging).

    digital man

    Synchronet "Real Fact" #38:
    Synchronet first supported Windows NT v6.x (a.k.a. Vista/Win7) w/v3.14a (2006).
    Norco, CA WX: 54.6°F, 70.0% humidity, 3 mph E wind, 0.04 inches rain/24hrs

    ---
    þ Synchronet þ Vertrauen þ Home of Synchronet þ telnet://vert.synchro.net
  • From Khelair@VERT/TINFOIL to Digital Man on Thursday, December 18, 2014 13:37:34
    Re: OpenBSD Synchronet 3.16 crash (connection lost/process killed variety)
    By: Digital Man to Khelair on Wed Dec 17 2014 18:46:28

    Any suggestions?

    The file could be named something else (like core.<pid>.sbbs), it depends on your system configuration.

    I'll look into possible OBSD settings that might be
    keeping it from dropping a core, but I'm kind of stumped for now.

    ... in addition to the core file size limit imposed on the user ('ulimit -c').

    If you're not getting a core dump when a process segfaults, then you should reconfigure your system to do so (for purposes of debugging).

    Okay. Sorry about the last unnecessary poast, then. I'll up my ulimit, and get on with the OpenBSD folks if I can't find anything in a FAQ so that I can get my greasy mitts on this information. Thanks for the pointers.

    ---
    þ Synchronet þ Tinfoil Tetrahedron BBS telnet://tinfoil.synchro.net
  • From Khelair@VERT/TINFOIL to Digital Man on Thursday, December 18, 2014 13:39:28
    Re: OpenBSD Synchronet 3.16 crash (connection lost/process killed variety)
    By: Digital Man to Khelair on Wed Dec 17 2014 18:46:28

    FWIW, I just found ulimit settings and my ulimit -c is set to unlimited, so I'm invoking the OBSD gurus here. Stay tuned and, as always, thnx & TIA for
    in the future.

    ---
    þ Synchronet þ Tinfoil Tetrahedron BBS telnet://tinfoil.synchro.net
  • From Digital Man@VERT to Khelair on Thursday, December 18, 2014 16:44:38
    Re: OpenBSD ulimits interfering w/core dumps
    By: Khelair to Digital Man on Thu Dec 18 2014 01:39 pm

    FWIW, I just found ulimit settings and my ulimit -c is set to unlimited, so I'm invoking the OBSD gurus here. Stay tuned and, as always, thnx & TIA for in the future.

    ulimit reports and changes the limits imposed on the current user. If you run ulimit under your personal account, but run the BBS as a different user, then that doesn't help. You need to run ulimit as the same user you run the BBS as. If you log in to your BBS via Telnet and ;SHELL and run 'ulimit -c' and it still reports unlimited, then you are likely getting core dumps if/when the BBS segfaults, but you just don't know where they're being stored.
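
The same check can be sketched in Python with the standard `resource` module, which reads the limits of the process it runs in. To be meaningful it has to be run as the BBS user (for example from the same shell that launches the BBS); the helper names here are illustrative only.

```python
import resource


def describe_limit(value):
    """Render an rlimit value as human-readable text."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    if value == 0:
        return "disabled (no core files will be written)"
    return f"{value} bytes"


def core_limits():
    """Return the (soft, hard) core-size limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_CORE)


soft, hard = core_limits()
print("soft:", describe_limit(soft))
print("hard:", describe_limit(hard))
```

The soft limit is the one that decides whether a core file is written; an unlimited hard limit alone is not enough if the soft limit is 0.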

    digital man

    Synchronet "Real Fact" #22:
    The third ever Synchronet BBS was The Beast's Domain (sysop: King Drafus).
    Norco, CA WX: 57.0°F, 68.0% humidity, 4 mph SSE wind, 0.00 inches rain/24hrs

    ---
    þ Synchronet þ Vertrauen þ Home of Synchronet þ telnet://vert.synchro.net
  • From Khelair@VERT/TINFOIL to Digital Man on Thursday, December 18, 2014 18:41:28
    Re: OpenBSD ulimits interfering w/core dumps
    By: Digital Man to Khelair on Thu Dec 18 2014 16:44:38

    ulimit reports and changes the limits imposed on the current user. If
    you run ulimit under your personal account, but run the BBS as a
    different user, then that doesn't help. You need to run ulimit as the
    same user you run the BBS as. If you log in to your BBS via Telnet and ;SHELL and run 'ulimit -c' and it still reports unlimited, then you are likely getting core dumps if/when the BBS segfaults, but you just don't know where they're being stored.

    Yeah. I had checked as the user that I run the BBS as before, but ran it again through SHELL just in case... Still unlimited. I'll do a more comprehensive find on the filesystem and still talk to the developers. :P

    ---
    þ Synchronet þ Tinfoil Tetrahedron BBS telnet://tinfoil.synchro.net