• 100% cpu usage

    From Al@VERT/TRMB to Digital Man on Thursday, June 28, 2018 19:45:37
    Hello Digital Man,

    After three to four days I begin to see one or two of the sbbs processes using 100% cpu. I am seeing this in htop and I can see the pid of the processes but I don't know what those processes are doing. htop says there are two pids using 100% cpu but the graph shows one cpu at 100%.

    I just shut down sbbs and restarted and all is well now. I notice that my ftn inbound directory fills with files but are not being tossed or sent to links so that is why I went into htop for a look and saw that.

    It is likely caused by something that I have misconfigured but I am not sure where to begin looking. Is there some way I can find out what a given pid is doing or trying to do?

    Ttyl :-),
    Al


    ... Cursor: An expert in four-letter words

    ---
    þ Synchronet þ The Rusty MailBox - Penticton, BC Canada
  • From Digital Man@VERT to Al on Friday, June 29, 2018 00:25:56
    Re: 100% cpu usage
    By: Al to Digital Man on Thu Jun 28 2018 07:45 pm

    Hello Digital Man,

    After three to four days I begin to see one or two of the sbbs processes using 100% cpu.

    I'm pretty sure you mean "threads" when you say processes.

    I am seeing this in htop and I can see the pid of the
    processes but I don't know what those processes are doing. htop says there are two pids using 100% cpu but the graph shows one cpu at 100%.

    I'm not familiar with 'htop'. But if you use 'top -H', it'll show you the individual threads of multi-threaded processes along with the threads' names (sbbs thread names start with "sbbs/"), for example:

    27422 sbbs 20 0 2401552 243088 10016 S 0.0 1.6 7:37.30 sbbs/sendMail
    27304 sbbs 20 0 2401552 243088 10016 S 0.0 1.6 0:00.00 sbbs/httpSess
    27293 sbbs 20 0 2401552 243088 10016 S 0.0 1.6 0:00.00 sbbs/httpSess
    27272 sbbs 20 0 2401552 243088 10016 S 0.0 1.6 0:00.00 sbbs/httpOutput
    27266 sbbs 20 0 2401552 243088 10016 S 0.0 1.6 0:00.02 sbbs/httpSess

    I just shut down sbbs and restarted and all is well now. I notice that my ftn inbound directory fills with files but are not being tossed or sent to links so that is why I went into htop for a look and saw that.

    sbbs doesn't "toss files" (sbbsecho does). So... maybe double-check whatever is supposed to be running sbbsecho?

    It is likely caused by something that I have misconfigured but I am not sure where to begin looking. Is there some way I can find out what a given pid is doing or trying to do?

    The thread names is the first clue. Then from there we can check out log entries to see the last thing that thread(s) in question were doing.

    digital man

    Synchronet/BBS Terminology Definition #8:
    BSO = Binkley Style Outbound
    Norco, CA WX: 59.4øF, 88.0% humidity, 3 mph SE wind, 0.00 inches rain/24hrs

    ---
    þ Synchronet þ Vertrauen þ Home of Synchronet þ [vert/cvs/bbs].synchro.net
  • From Al@VERT/TRMB to Digital Man on Friday, June 29, 2018 07:56:08
    Re: 100% cpu usage
    By: Digital Man to Al on Fri Jun 29 2018 12:25 am

    I'm pretty sure you mean "threads" when you say processes.

    Yes, I think so.

    I'm not familiar with 'htop'. But if you use 'top -H', it'll show you the individual threads of multi-threaded processes along with the threads' names (sbbs thread names start with "sbbs/"), for example:

    htop is similar to top but it gives you a graph of cpu(s), memory and swap usage at the top of the screen. It's pretty but I only see exec/sbbs, not the sbbs/* that I see in top.

    Yes, I can see the threads described better in top. I'll use that to get a better idea of where to look the next time it happens.

    sbbs doesn't "toss files" (sbbsecho does). So... maybe double-check whatever is supposed to be running sbbsecho?

    It is the FIDOIN timed event that runs sbbsecho so that stops running for some reason. Either the data/fidoin.now is not being touched or acted upon although BinkIT seems to continue to run.

    The thread names is the first clue. Then from there we can check out log entries to see the last thing that thread(s) in question were doing.

    I see that in top now so that will give me a better idea of what's happening.

    Thanks!

    Ttyl :-),
    Al


    ... Almost everything in life is easier to get into than out of.

    ---
    þ Synchronet þ The Rusty MailBox - Penticton, BC Canada
  • From Digital Man@VERT to Al on Friday, June 29, 2018 12:04:37
    Re: 100% cpu usage
    By: Al to Digital Man on Fri Jun 29 2018 07:56 am

    Re: 100% cpu usage
    By: Digital Man to Al on Fri Jun 29 2018 12:25 am

    I'm pretty sure you mean "threads" when you say processes.

    Yes, I think so.

    I'm not familiar with 'htop'. But if you use 'top -H', it'll show you the individual threads of multi-threaded processes along with the threads' names (sbbs thread names start with "sbbs/"), for example:

    htop is similar to top but it gives you a graph of cpu(s), memory and swap usage at the top of the screen. It's pretty but I only see exec/sbbs, not the sbbs/* that I see in top.

    Yes, I can see the threads described better in top. I'll use that to get a better idea of where to look the next time it happens.

    sbbs doesn't "toss files" (sbbsecho does). So... maybe double-check whatever is supposed to be running sbbsecho?

    It is the FIDOIN timed event that runs sbbsecho so that stops running for some reason. Either the data/fidoin.now is not being touched or acted upon although BinkIT seems to continue to run.

    Or it could be that you have a timed-event that is running in an infinite loop? This would explain both high CPU utilization and the failure to run any more timed events.

    digital man

    Synchronet/BBS Terminology Definition #33:
    KD = King Drafus (Allen Christiansen)
    Norco, CA WX: 73.3øF, 56.0% humidity, 3 mph NW wind, 0.00 inches rain/24hrs

    ---
    þ Synchronet þ Vertrauen þ Home of Synchronet þ [vert/cvs/bbs].synchro.net
  • From Al@VERT/TRMB to Digital Man on Friday, June 29, 2018 18:55:11
    Re: 100% cpu usage
    By: Digital Man to Al on Fri Jun 29 2018 12:04 pm

    Or it could be that you have a timed-event that is running in an infinite loop?

    All of my timed events are stock events from the default install with the exception of binkit and tickit that I installed later.

    TTBOMK they are all working as expected.

    This would explain both high CPU utilization and the failure to run
    any more timed events.

    If/when that happens again I'll let you know what I see in top.

    Ttyl :-),
    Al


    ... IBM: It may be slow, but at least it's expensive.

    ---
    þ Synchronet þ The Rusty MailBox - Penticton, BC Canada
  • From Al@VERT/TRMB to Digital Man on Sunday, July 08, 2018 13:02:24
    Re: 100% cpu usage
    By: Digital Man to Al on Fri Jun 29 2018 12:04 pm

    It is the FIDOIN timed event that runs sbbsecho so that stops running
    for some reason. Either the data/fidoin.now is not being touched or
    acted upon although BinkIT seems to continue to run.

    Or it could be that you have a timed-event that is running in an infinite loop? This would explain both high CPU utilization and the failure to run any more timed events.

    I think this is what is happening. It doesn't happen all the time, I see it every four days or so. I saw it happening earlier today and I think this log entry shows what is happening.


    Jul 8 07:00:42 trmb synchronet: evnt poll_node 11:1/100@wwivftn, outbound_dir:
    /sbbs/fido/outbound
    Jul 8 07:00:42 trmb synchronet: evnt Locking /sbbs/fido/outbound.00b/00010064. bsy
    Jul 8 07:00:42 trmb synchronet: evnt Lock successful.
    Jul 8 07:00:42 trmb synchronet: evnt Attempting poll for node 11:1/100@wwivftn Jul 8 07:00:42 trmb synchronet: evnt JSBinkP/1.106 callout to 11:1/100@wwivftn
    started


    The poll seems to end there. Searching my log for ": evnt " after this turns up nothing and one of my cores is sitting at 100% cpu until I restart sbbs. Below is a successful poll for the same node I scooped from the log. I haven't noticed any problems polling that node.


    Jul 6 03:50:19 trmb synchronet: evnt poll_node 11:1/100@wwivftn, outbound_dir:
    /sbbs/fido/outbound
    Jul 6 03:50:19 trmb synchronet: evnt Locking /sbbs/fido/outbound.00b/00010064. bsy
    Jul 6 03:50:19 trmb synchronet: evnt Lock successful.
    Jul 6 03:50:19 trmb synchronet: evnt Attempting poll for node 11:1/100@wwivftn Jul 6 03:50:19 trmb synchronet: evnt JSBinkP/1.106 callout to 11:1/100@wwivftn
    started
    Jul 6 03:50:20 trmb synchronet: evnt Sent M_NUL command args: OPT CRYPT
    Jul 6 03:50:20 trmb synchronet: evnt Sent M_NUL command args: SYS The Rusty Ma ilBox
    Jul 6 03:50:20 trmb synchronet: evnt Sent M_NUL command args: ZYZ Alan Ianson Jul 6 03:50:20 trmb synchronet: evnt Sent M_NUL command args: LOC Penticton, B C Canada
    Jul 6 03:50:20 trmb synchronet: evnt Sent M_NUL command args: NDL 115200,TCP,B INKP
    Jul 6 03:50:20 trmb synchronet: evnt Sent M_NUL command args: TIME Fri Jul 06 2018 03:50:20 GMT-0700 (PDT)
    Jul 6 03:50:20 trmb synchronet: evnt Sent M_NUL command args: VER BinkIT/2.4,J SBinkP/1.106,sbbs3.17a/Linux binkp/1.1
    Jul 6 03:50:20 trmb synchronet: evnt Sent M_ADR command args: 11:1/201@wwivftn
    44:250/0@dorenet 1:153/0@fidonet 1:153/757@fidonet 21:4/106@fsxnet 32:1/5@game net 637:1/104@happynet 24:240/2@sportnet 24:240/1@sportnet 24:240/0@sportnet 31 6:403/5@whispnet
    Jul 6 03:50:20 trmb synchronet: evnt Got M_NUL command args: OPT CRAM-MD5-9d1c bfefb19c70026bc55727183126ee
    Jul 6 03:50:20 trmb synchronet: evnt Got M_NUL command args: SYS Weather Stati on Hub
    Jul 6 03:50:20 trmb synchronet: evnt Got M_NUL command args: ZYZ Mark Hofmann Jul 6 03:50:20 trmb synchronet: evnt Got M_NUL command args: LOC Bel Air, Mary land/USA
    Jul 6 03:50:20 trmb synchronet: evnt Got M_NUL command args: NDL 9600,TCP,BINK P
    Jul 6 03:50:20 trmb synchronet: evnt Got M_NUL command args: TIME Fri, 6 Jul 2018 06:50:20 -0400
    Jul 6 03:50:20 trmb synchronet: evnt Got M_NUL command args: VER binkd/1.1a-95 /Win32 binkp/1.1
    Jul 6 03:50:20 trmb synchronet: evnt Peer version: binkd/1.1a-95/Win32 binkp/1 .1
    Jul 6 03:50:20 trmb synchronet: evnt Got M_ADR command args: 1:261/1@fidonet 1:261/0@fidonet 11:1/100@wwivftn 21:1/131@fsxnet 723:1/3@dovenet 44:443/0@doren et 46:1/163@agoranet 618:100/13@micronet
    Jul 6 03:50:20 trmb synchronet: evnt Sent M_PWD command args: CRAM-MD5-659c52b dc7a4fa213f3d7aa016b5241c
    Jul 6 03:50:20 trmb synchronet: evnt Got M_NUL command args: TRF 0 0
    Jul 6 03:50:20 trmb synchronet: evnt Got M_NUL command args: OPT EXTCMD CRYPT GZ
    Jul 6 03:50:20 trmb synchronet: evnt Will encrypt session.
    Jul 6 03:50:20 trmb synchronet: evnt Got M_OK command args: secure
    Jul 6 03:50:20 trmb synchronet: evnt Authentication successful: secure
    Jul 6 03:50:20 trmb synchronet: evnt Unconfigured address 1:261/1@fidonet
    Jul 6 03:50:20 trmb synchronet: evnt Unconfigured address 1:261/0@fidonet
    Jul 6 03:50:20 trmb synchronet: evnt Unconfigured address 21:1/131@fsxnet
    Jul 6 03:50:20 trmb synchronet: evnt Unconfigured address 723:1/3@dovenet
    Jul 6 03:50:20 trmb synchronet: evnt Unconfigured address 44:443/0@dorenet
    Jul 6 03:50:20 trmb synchronet: evnt Unconfigured address 46:1/163@agoranet Jul 6 03:50:20 trmb synchronet: evnt Unconfigured address 618:100/13@micronet Jul 6 03:50:20 trmb synchronet: evnt Adding outbound files for 11:1/100@wwivft n
    Jul 6 03:50:20 trmb synchronet: evnt Initializing crypt keys.
    Jul 6 03:50:20 trmb synchronet: evnt Got M_EOB command args:
    Jul 6 03:50:20 trmb synchronet: evnt Sent M_EOB command args:
    Jul 6 03:50:20 trmb synchronet: evnt Sent M_EOB command args:
    Jul 6 03:50:20 trmb synchronet: evnt Remote host closed socket
    Jul 6 03:50:20 trmb synchronet: evnt Unlocking /sbbs/fido/outbound.00b/0001006 4.bsy.

    Ttyl :-),
    Al


    ... Thesaurus: ancient reptile with an excellent vocabulary

    ---
    þ Synchronet þ The Rusty MailBox - Penticton, BC Canada
  • From Al@VERT/TRMB to Digital Man on Sunday, July 08, 2018 13:05:10
    Re: 100% cpu usage
    By: Al to Digital Man on Sun Jul 08 2018 01:02 pm

    I think this is what is happening. It doesn't happen all the time, I see it every four days or so. I saw it happening earlier today and I think this log entry shows what is happening.

    I should have mentioned, top says the thread is SBBS/Events.

    Ttyl :-),
    Al


    ... Spent all my money on booze, boats & blonds. The rest was wasted.

    ---
    þ Synchronet þ The Rusty MailBox - Penticton, BC Canada
  • From Al@VERT/TRMB to Digital Man on Sunday, July 08, 2018 13:32:25
    Re: 100% cpu usage
    By: Al to Digital Man on Sun Jul 08 2018 01:02 pm

    11:1/100@wwivftn Jul 8 07:00:42 trmb synchronet: evnt JSBinkP/1.106 callout to 11:1/100@wwivftn
    started

    I had my cursor misplaced when I cut and pasted that.. there is one more line..

    Jul 8 07:00:42 trmb synchronet: evnt poll_node 11:1/100@wwivftn, outbound_dir:
    /sbbs/fido/outbound
    Jul 8 07:00:42 trmb synchronet: evnt Locking /sbbs/fido/outbound.00b/00010064. bsy
    Jul 8 07:00:42 trmb synchronet: evnt Lock successful.
    Jul 8 07:00:42 trmb synchronet: evnt Attempting poll for node 11:1/100@wwivftn Jul 8 07:00:42 trmb synchronet: evnt JSBinkP/1.106 callout to 11:1/100@wwivftn
    started
    Jul 8 07:00:43 trmb synchronet: evnt Sent M_NUL command args: OPT CRYPT

    Sorry about that.. :)

    Ttyl :-),
    Al


    ... You cannot achieve the impossible without attempting the absurd.

    ---
    þ Synchronet þ The Rusty MailBox - Penticton, BC Canada
  • From Digital Man@VERT to Al on Sunday, July 08, 2018 14:52:10
    Re: 100% cpu usage
    By: Al to Digital Man on Sun Jul 08 2018 01:02 pm

    Re: 100% cpu usage
    By: Digital Man to Al on Fri Jun 29 2018 12:04 pm

    It is the FIDOIN timed event that runs sbbsecho so that stops running
    for some reason. Either the data/fidoin.now is not being touched or
    acted upon although BinkIT seems to continue to run.

    Or it could be that you have a timed-event that is running in an infinite loop? This would explain both high CPU utilization and the failure to run any more timed events.

    I think this is what is happening. It doesn't happen all the time, I see it every four days or so. I saw it happening earlier today and I think this log entry shows what is happening.

    I've added a bit more log output to binkit.js and binkp.js. When it happens again (I didn't fix anything), please send me the log output.

    I suspect it might be blocking in connect(), but that would be weird and wouldn't normally cause any busy-looping even if it did. But we'll get to the bottom of it with logging.

    digital man

    Synchronet/BBS Terminology Definition #3:
    ASCII = American Standard Code for Information Interchange
    Norco, CA WX: 98.5øF, 22.0% humidity, 18 mph ENE wind, 0.00 inches rain/24hrs

    ---
    þ Synchronet þ Vertrauen þ Home of Synchronet þ [vert/cvs/bbs].synchro.net
  • From Digital Man@VERT to Al on Sunday, July 08, 2018 14:52:26
    Re: 100% cpu usage
    By: Al to Digital Man on Sun Jul 08 2018 01:05 pm

    Re: 100% cpu usage
    By: Al to Digital Man on Sun Jul 08 2018 01:02 pm

    I think this is what is happening. It doesn't happen all the time, I see it every four days or so. I saw it happening earlier today and I think this log entry shows what is happening.

    I should have mentioned, top says the thread is SBBS/Events.

    Which makes complete sense now.

    digital man

    This Is Spinal Tap quote #2:
    Nigel Tufnel: Well, this piece is called "Lick My Love Pump".
    Norco, CA WX: 98.5øF, 22.0% humidity, 18 mph ENE wind, 0.00 inches rain/24hrs

    ---
    þ Synchronet þ Vertrauen þ Home of Synchronet þ [vert/cvs/bbs].synchro.net
  • From Digital Man@VERT to Al on Sunday, July 08, 2018 14:55:09
    Re: 100% cpu usage
    By: Al to Digital Man on Sun Jul 08 2018 01:32 pm

    Re: 100% cpu usage
    By: Al to Digital Man on Sun Jul 08 2018 01:02 pm

    11:1/100@wwivftn Jul 8 07:00:42 trmb synchronet: evnt JSBinkP/1.106 callout to 11:1/100@wwivftn
    started

    I had my cursor misplaced when I cut and pasted that.. there is one more line..

    Jul 8 07:00:42 trmb synchronet: evnt poll_node 11:1/100@wwivftn, outbound_dir:
    /sbbs/fido/outbound
    Jul 8 07:00:42 trmb synchronet: evnt Locking /sbbs/fido/outbound.00b/00010064. bsy
    Jul 8 07:00:42 trmb synchronet: evnt Lock successful.
    Jul 8 07:00:42 trmb synchronet: evnt Attempting poll for node 11:1/100@wwivftn Jul 8 07:00:42 trmb synchronet: evnt JSBinkP/1.106 callout to 11:1/100@wwivftn
    started
    Jul 8 07:00:43 trmb synchronet: evnt Sent M_NUL command args: OPT CRYPT

    Sorry about that.. :)

    Ah, okay, that helps a lot. It's not blocking on connect then. I may even see the problem. Try the next commit of load/binkp.js out. :-)

    digital man

    Synchronet "Real Fact" #32:
    The second most prolific contributor to Synchronet is Stephen Hurd (Deuce). Norco, CA WX: 98.5øF, 22.0% humidity, 18 mph ENE wind, 0.00 inches rain/24hrs

    ---
    þ Synchronet þ Vertrauen þ Home of Synchronet þ [vert/cvs/bbs].synchro.net
  • From Al@VERT/TRMB to Digital Man on Sunday, July 08, 2018 15:58:04
    Re: 100% cpu usage
    By: Digital Man to Al on Sun Jul 08 2018 02:55 pm

    Ah, okay, that helps a lot. It's not blocking on connect then. I may even see the problem. Try the next commit of load/binkp.js out. :-)

    I just updated and see this error now.

    Jul 8 15:54:53 trmb synchronet: srvc 0054 BINKP connection accepted from: 68.1 02.199.252 port 61737
    Jul 8 15:54:53 trmb synchronet: srvc 0054 BINKP JavaScript service thread star ted
    Jul 8 15:54:53 trmb synchronet: srvc 0054 BINKP !JavaScript /sbbs/exec/binkit .js line 31: InternalError: too much recursion
    Jul 8 15:54:53 trmb synchronet: srvc 0054 BINKP service thread terminated (0 c lients remain, 0 total, 33 served)

    Ttyl :-),
    Al


    ... 75%, no, 72% of all statistics are made up on the spot.

    ---
    þ Synchronet þ The Rusty MailBox - Penticton, BC Canada
  • From Digital Man@VERT to Al on Sunday, July 08, 2018 20:16:08
    Re: 100% cpu usage
    By: Al to Digital Man on Sun Jul 08 2018 03:58 pm

    Re: 100% cpu usage
    By: Digital Man to Al on Sun Jul 08 2018 02:55 pm

    Ah, okay, that helps a lot. It's not blocking on connect then. I may even see the problem. Try the next commit of load/binkp.js out. :-)

    I just updated and see this error now.

    Jul 8 15:54:53 trmb synchronet: srvc 0054 BINKP connection accepted from: 68.1 02.199.252 port 61737
    Jul 8 15:54:53 trmb synchronet: srvc 0054 BINKP JavaScript service thread star ted
    Jul 8 15:54:53 trmb synchronet: srvc 0054 BINKP !JavaScript /sbbs/exec/binkit .js line 31: InternalError: too much recursion

    Fixed.

    digital man

    Synchronet "Real Fact" #29:
    Rob Swindell first called BBSes (at 300bps) with an Apple II computer in 1982. Norco, CA WX: 84.7øF, 39.0% humidity, 3 mph ESE wind, 0.00 inches rain/24hrs

    ---
    þ Synchronet þ Vertrauen þ Home of Synchronet þ [vert/cvs/bbs].synchro.net
  • From Digital Man@VERT to Al on Tuesday, July 10, 2018 00:03:48
    Re: 100% cpu usage
    By: Al to Digital Man on Sun Jul 08 2018 03:58 pm

    Re: 100% cpu usage
    By: Digital Man to Al on Sun Jul 08 2018 02:55 pm

    Ah, okay, that helps a lot. It's not blocking on connect then. I may even see the problem. Try the next commit of load/binkp.js out. :-)

    I just updated and see this error now.

    Jul 8 15:54:53 trmb synchronet: srvc 0054 BINKP connection accepted from: 68.1 02.199.252 port 61737
    Jul 8 15:54:53 trmb synchronet: srvc 0054 BINKP JavaScript service thread star ted
    Jul 8 15:54:53 trmb synchronet: srvc 0054 BINKP !JavaScript /sbbs/exec/binkit .js line 31: InternalError: too much recursion

    That error should be fixed now.

    digital man

    Synchronet/BBS Terminology Definition #28:
    IHA = Ille Homine Albe (Steve Deppe)
    Norco, CA WX: 75.4øF, 62.0% humidity, 0 mph SSW wind, 0.00 inches rain/24hrs

    ---
    þ Synchronet þ Vertrauen þ Home of Synchronet þ [vert/cvs/bbs].synchro.net
  • From Al@VERT/TRMB to Digital Man on Tuesday, July 10, 2018 10:30:53
    Re: 100% cpu usage
    By: Digital Man to Al on Tue Jul 10 2018 12:03 am

    That error should be fixed now.

    Yep, it is. I'm running BinkIT 2.6 and JSBinkP 1.109 now and all is well.

    Thanks.. :)

    Ttyl :-),
    Al


    ... Math problems? Call 1-800-10*(24+13)-(64-16)/2^14E2.

    ---
    þ Synchronet þ The Rusty MailBox - Penticton, BC Canada