I know I've mentioned this before, but the Synchronet bug that a few people have talked about, the one that pegs a thread at 100% CPU after one of the networks tries to pull messages (I believe, though this is a glorified assumption at this point), has been hitting me a lot more often recently. At least once a day now I find that no messages have been imported to any of the networked subs for a prolonged period, and inevitably, when I check the CPU stats, the sbbs process is pegged at 100%. A kill -15 won't stop it; after a while I kill -9 it and restart it, and things seem to work again. This time around I haven't noticed any particular sub-boards being corrupted in the process, but I've trimmed down the number of sub-boards I'm reading lately for lack of time, and FIDONet isn't posting anything for me because of an error that my RC doesn't seem to be able to help me fix.
So I'm not sure exactly which networked function it may be, but when it happens it shuts down importing of all networked messages across 5 networks. It's really a hindrance, and I'd rather not have to fall back on a shell script that runs every hour, checks whether CPU usage has been pegged for too long, and then kills sbbs off and restarts it. That just can't be good for anything.
Can anybody give me some more information on how to get around this, since I still can't get a more recent version compiled on OBSD? I guess I could just default to disabling networked bases, one at a time (my preliminary suspect is FIDO), until it doesn't seem to happen any more, but that seems like it'd be an unreliable and really time-consuming way to get to the bottom of this.
Any input appreciated.
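If the hourly check mentioned above does end up being needed as a stopgap, the detection half could look something like the sketch below. It is only an illustration: `pegged_processes` is a made-up name, the 95% threshold is arbitrary, and it assumes the text it is given has three columns (PID, %CPU, command), which is what something like `ps -axo pid,pcpu,comm` prints on many systems (check ps(1) on your platform). It deliberately does not kill or restart anything.

```python
import os


def pegged_processes(ps_text, name="sbbs", threshold=95.0):
    """Return (pid, cpu) for rows whose command is `name` and whose %CPU >= threshold.

    ps_text is assumed to hold 'PID %CPU COMMAND' rows; header and
    malformed rows are skipped.
    """
    hits = []
    for line in ps_text.splitlines():
        fields = line.split(None, 2)
        if len(fields) < 3 or not fields[0].isdigit():
            continue  # header line or something unexpected
        try:
            cpu = float(fields[1])
        except ValueError:
            continue  # %CPU column was not a number
        # Compare only the executable's base name, ignoring path and arguments.
        executable = os.path.basename(fields[2].split()[0])
        if executable == name and cpu >= threshold:
            hits.append((int(fields[0]), cpu))
    return hits
```

A single sample only shows that the process is busy right now. To tell "pegged for too long" apart from a short burst, the caller would have to see the same PID come back over several consecutive runs before acting on it.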
Are all of these "networks" FidoNet-technology networks (FTNs)? If so, then the process that handles importing and exporting would be SBBSecho, not sbbs. Which process exactly do you see at 100% CPU utilization? What is the log output at the time that is occurring? What versions of SBBS and SBBSecho are you using? Without more details, it's really hard to help.
When this is occurring, take a look in your /sbbs/data/ directory for *.now. If one exists (usually fidoin.now or fidoout.now or something similar), no other events will run until that one is done. So if no others are running, and one of those .now files exists, *that* is the one causing other events not to run.
With that, you can narrow down exactly which event is doing this. After you know that, and if it's fidoin.now, you can check /sbbs/data/sbbsecho.log for any errors importing messages during that timeframe. If it's fidoout.now check the same log for exporting errors.
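The check described above can be sketched in a few lines of Python. This is only a rough illustration: `pending_semaphores` is a made-up name, the default directory is the /sbbs/data path mentioned above, and the file's modification time is used as a stand-in for "how long this event has been pending", which is an assumption.

```python
import time
from pathlib import Path


def pending_semaphores(data_dir="/sbbs/data", now=None):
    """Return (filename, age_in_seconds) for every *.now file, oldest first.

    Age is measured from the file's modification time (an assumption
    about when the event was triggered).
    """
    now = time.time() if now is None else now
    found = []
    for path in Path(data_dir).glob("*.now"):
        found.append((path.name, now - path.stat().st_mtime))
    # Oldest file first: that is the event most likely holding up the rest.
    return sorted(found, key=lambda item: item[1], reverse=True)
```

Calling `pending_semaphores()` while the CPU is pegged and printing the result should show which .now file has been sitting there the longest, which then tells you whether to look for import or export errors in /sbbs/data/sbbsecho.log.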
Well, I caught a couple of atypical ones now. Straight-up crashes, where I've got an open session and I come back a while later and the connection is terminated. These appear to be happening right around the time that qnet-qwk.now is being created, though they don't appear to have anything in the associated .lo? file.
For one, you don't ever have to associate QWK messages with .?lo files whatsoever. Two completely different transfer protocols. My question for you would be, are you hosting a QWK network? Or maybe it's when you're polling VERT for Dovenet?
Maybe check your system log and see if there's any odd things going on right around the time it crashes.
I meant what I said about .lo? files, as in the ones that accumulate
in /sbbs/data/logs/*.lo? (.log & .lol).
Maybe check your system log and see if there's any odd things
going on right around the time it crashes.
Yep, that's what I referenced doing in the above file extensions.
;)
someone confused .lo? files with .?lo files... the latter are binkley style mailer files ;)