Loading ...
Home » Spaces » Racks, Rack Accessories, & Cooling » discussion » General » possible bug / AOS 6.4.4 / hostname-in-syslog

possible bug / AOS 6.4.4 / hostname-in-syslog

Discussion in Racks, Rack Accessories, & Cooling started by Stuart , 12/2/2016 12:55 PM
Login to follow, share, and participate in this space.
Not a member?Join now
  • stuartk
    Stuart
    Apprentice Apprentice
    Stuart 12/2/2016 12:55 PM

    When I use the Configuration... Logs... Syslog... Test function, I see the following on my loghost:

    2016-12-02T04:44:18.555318-08:00 p1n-pdu 12/02/2016 04:44:00 10.128.13.52 APC: Test Syslog.

    However, when the NMC emits any other message to syslog, I see the IP address, rather than the host name:

    2016-12-02T04:49:37-08:00 10.11.12.13 Network Interface restarting. SNMP user request from 10.11.73.12. 0x0009

    This smells buggy to me.

    BTW:  here is why I would greatly prefer seeing hostnames in syslog, rather than IP addresses.

    - My monitoring software can be configured to pick-up strings from syslog, e.g. "Look for anything containing "{stuff}-pdu" and if you see such a line, parse it for xyz problems and then send email to people abc".  Yes, it can be configured to watch for a list of IP addresses instead ... but that's a lot of complexity plus fragile (I have to maintain a list of IP addresses, rather than merely insisting that each PDU be named "{stuff}-pdu").

    - I run nightly reports which extract 'interesting' lines from syslog, again using the easy parsing of "Looking for {stuff}-pdu" -- maintaining a list of IP addresses here would be similarly complex & fragile.

    Would you agree that this is a bug?  Do you know of a work-around?  How would I report this bug to your development folks?

    --sk

  • ipicKedawinna
    Angela
    =S= Representative
    Angela 12/2/2016 2:32 PM (in response to Stuart)

    Hi Stuart,

    You already reported this to us here actually: http://forums.apc.com/spaces/7/ups-management-devices-powerchute-software/forums/general/12638/log-to-syslog-using-hostname-rather-than-ip-address/1#57210 and I went ahead and logged it as a bug for the development team to evaluate and review. I will bring it up again for the next AOS release early next year to see if we can make it happen.

  • Terry_Kennedy
    Terry
    Apprentice Apprentice
    Terry 12/5/2016 11:22 AM (in response to Stuart)

    I believe syslogd does the hostname lookup as part of the recvfrom() loop. In "classic" 4BSD-derived syslogd, this is done in  the cvthname() function. This normally indicates a problem with the DNS PTR lookup. However, the manpage says "The date and time are taken from the received message. If the format of the timestamp field is incorrect, time obtained from the local host is used instead." The timestamps of the 2 sample log lines you posted are definitely in 2 different formats, so I'm wondering if that is the problem here. Again, "classic" 4BSD-derived syslogd has a -d flag which logs a bunch of stuff, including hostname lookup errors. It doesn't log anything other than the raw message in the time conversion routine, unfortunately.

    I don't see anything in the 6.4.6 configuration menu that adjusts timestamps, so if this is an APC bug, it would seem to be in timestamp generation (probably uses a different format when doing the syslog test).

  • stuartk
    Stuart
    Apprentice Apprentice
    Stuart 12/5/2016 12:20 PM (in response to Terry)

    OK, Terry, I've been reading 'man rsyslogd' (v8.16.0) and learning about hostname lookup -- thank you for the pointer.

    On my log host:

    vishnu> ps -ef | grep syslog

    syslog     871     1  0 Dec01 ?        00:00:10 /usr/sbin/rsyslogd -n

    I took a pcap (i.e. ran dumpcap on my loghost) and grabbed two frames:  the syslog msg as emitted when I perform the 'Test' function and the syslog message emitted when I reboot the NMC.  Here are the results:

    syslog:

    2016-12-05T03:46:37.502241-08:00 p1n-pdu 12/05/2016 03:46:14 10.128.13.52 APC: Test Syslog.
    2016-12-05T03:46:49-08:00 10.12.13.52 Network Interface restarting. Web user request. 0x0009

    pcap:

    12/05/2016 03:46:14 10.12.13.52 APC: Test Syslog.

    Dec 5 03:46:49 10.12.13.52 Network Interface restarting. Web user request. 0x0009

    Two items pop out to me:

    - The test message and the reboot message differ in content

    - The hostname format (an IP address) is identical on the wire ... but different in syslog.

    I note that I register everything in DNS:

    vishnu> host p1n-pdu
    p1n-pdu.corp.alleninstitute.org has address 10.12.13.52
    vishnu> host 10.12.13.52
    52.13.12.10.in-addr.arpa domain name pointer p1n-pdu.company.com.
    vishnu>

    Although actually ... I consult /etc/hosts first for name resolution:

    vishnu> grep hosts /etc/nsswitch.conf
    hosts:          files dns
    vishnu>

    vishnu> grep 10.12.13.52 /etc/hosts
    10.12.13.52    p1n-pdu
    vishnu>

    Hmm.

    What do other boxes log?  Here is an example from an EMC storage system:

    syslog:

    2016-12-05T04:09:51-08:00 bh-dc-isi1-11 snmpd[41181]: sysctlbyname(kern.file) failed: Cannot allocate memory

    pcap:

    2016-12-05T04-09:51-08:00 bh-dc-isi1-11 snmpd[41181]: sysctlbyname(kern.file) failed:  Cannot allocate memory

    I'm puzzled.  Here's what I think I know:

    - The average host logs its hostname rather than its IP address and inserts this directly after the time stamp.

    - However, if the host logs its IP address instead -- as AOS does -- then syslogd would, I predict, perform name resolution.  [Because that is the default, and I have not disabled it using the '-x' flag.]

    - And my syslog daemon does, when presented with the Test message ... but does not normally.

    ==> Why?  I see no reason for rsyslogd to behave differently, when presented with these two messages.

    I've been trying to load rsyslogd with the '-d' flag ... which I can do ... to examine any name resolution lines .. but then I haven't figured out where to look for its debug output.  I hope to return to this task later in the week and post the results.  In the meantime, if you have additional pointers, do let me know.

    --sk

  • Terry_Kennedy
    Terry
    Apprentice Apprentice
    Terry 12/5/2016 12:58 PM (in response to Stuart)

    It looks like it is definitely due to the timestamp format:

    2016-12-05T03:46:37.502241-08:00 p1n-pdu 12/05/2016 03:46:14 10.128.13.52 APC: Test Syslog.
    2016-12-05T03:46:49-08:00 10.12.13.52 Network Interface restarting. Web user request. 0x0009

    In both cases the IP address is making it into syslog (bolded part). The difference seems to be that the 2nd message has had its timestamp removed between being received and being logged, since it is in the pcap.

    Since you mention rsyslogd, this is probably a Linux system. I looked at the GIT for rsyslog and it is a giant plate of spaghetti compared to the BSD one. I couldn't quickly find its timestamp-parsing code. And it probably uses a completely different set of flags, so -d may not do the same thing.

    If you want, I can try this on a FreeBSD system and see what syslogd does with both test and regular messages.

    Do me a favor and "like" this post so I'll get an alert when I visit the site again - otherwise I have to look around each forum for replies to posts I've made.

    And it seems this forum is still using a tag for bolding that is unrecognized by Internet Explorer. The 10.128.13.52 in the 2 syslog lines should be bold (and will be, if you aren't using IE).

  • stuartk
    Stuart
    Apprentice Apprentice
    Stuart 12/5/2016 1:27 PM (in response to Terry)

    man rsyslogd

    [...]

           -d     Turns on debug mode. See the DEBUGGING section for more information.
    Ubuntu 16.04

    Terry -- yes please -- if you're willing to provide the FreeBSD behavior, I would appreciate that.

    I would like to provide Angela with best proposal for changes in AOS (if any).

    --sk

  • Terry_Kennedy
    Terry
    Apprentice Apprentice
    Terry 12/5/2016 7:34 PM (in response to Stuart)

    My syslogd does somewhat better:

    Dec  5 13:55:10 ups2 12/05/2016 13:55:10 10.20.30.17 APC: Test Syslog.
    Dec  5 13:56:06 ups2 10.20.30.17 Environment: Lost the local network management interface-to-integrated Environmental Monitor (Universal I/O at Port 1) communication. 0x0345

    Packet contents:

    <176>12/05/2016 13:55:10 10.20.30.17 APC: Test Syslog.
    <178>Dec  5 13:56:06 10.20.30.17 Environment: Lost the local network management interface-to-integrated Environmental Monitor (Universal I/O at Port 1) communication. 0x0345

    So, the APC is sending 2 different formats, one for test messages, and a different one for actual messages. The actual message complies with RFC 3164 (page 9, "TIMESTAMP") while the test message is neither RFC 3164 nor RFC 5424 (page 11, "TIMESTAMP").

    Note that APC correctly fixing this bug will apparently give you syslog entries where neither test nor actual messages have the hostname. That is apparently a bug (actually, a pair of bugs) in rsyslogd. I took a quick look at the rsyslogd documentation and it appears that setting "fromhost" in the rsyslogd configuration (perhaps in a stanza restricted to your APC management cards) may do what you want. However, since it appears rsyslogd is mis-parsing the message, it may not work. I suggest bringing this up on the rsyslogd mailing list and seeing what people more familiar with that software think about it.

    My syslog entries and the packet captures above are from NMC2 firmware 6.4.6 on a Symmetra RM6K. I repeated the packet captures on an older NMC running 3.9.1 on a Matrix 5000 and I get:

    <176>Dec  5 14:28:12 10.20.30.20 APC: Test Syslog.
    <178>Dec  5 14:28:46 10.20.30.20 Environment: Lost the local network management interface-to-integrated Environmental Monitor Input TH Sensor (Integrated at Int Sensor Loc) communication. 0x0345

    So, the test message format is correct in the older card's firmware.

  • stuartk
    Stuart
    Apprentice Apprentice
    Stuart 2/27/2017 11:22 PM (in response to Terry)

    For posterity, here is what I did:

    Three Cases

    (1) Some devices log their hostname in the syslog message

    (2) Some devices (e.g. APC devices) log their IP addresses in the syslog message

    (3) Some devices log neither

    On my rsyslog-8.0.16 installation (bundled with Ubuntu 16.04) and rsyslog-7.4.7 installation (bundled with CentOS 7), rsyslog handles #3 just fine -- it inserts hostname (as performed by gegthostbyip).

    And I used the following rsyslog.conf incantations to handle #2, i.e. to convert IP addresses to hostnames

    template(name="ReplaceIPWithHostname" type="list") {
        property(name="timestamp" dateFormat="rfc3339")
        constant(value=" ")
        property(name="fromhost")
        constant(value=" ")
        property(name="syslogtag")
        property(name="msg" spifno1stsp="on" )
        property(name="msg" droplastlf="on" )
        constant(value="\n")
    }

    #
    # Log most everything.  If host sent its IP address instead of its hostname,
    # translate IP address to hostname
    #
    if re_match($hostname, '[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}\\.[0-9]{1,3}') then
     {
      *.*                                               /var/log/syslog;ReplaceIPWithHostname
      stop
    }
    *.*                                                 /var/log/syslog

  • Page 1 of 1 (8 items)
Choose your language:  
powered by Communifire
Version 5.2.6402.40375