Symmetra RM showing an "internal communication failure" | Smart-UPS & Symmetra LX / RM

Discussion in Smart-UPS & Symmetra LX / RM started by Roger, 3/21/2019 9:52 PM
Posted in: General

  • RogerinMD

    I recently had the batteries fail on this unit, showing an "internal temperature exceeded" fault, which I found was due to a low center tap voltage in the sense circuit.  On restart, the unit powers up and reports everything operational until I turn on line power, at which point it shows the internal communication failure with the RIM.  The unit will switch into bypass, but won't go online.

    Are there any additional checks or troubleshooting steps I can try?

    --UPDATE--

    The UPS will also come up with line power in maintenance bypass and everything plugged in but the battery.  All report OK in Diags until the battery goes in, then I get an internal communication failure and both Power Modules and the RIM report "NoCom-MIM/RIM".  The battery reports as 128V actual/120V nominal, but slowly starts draining if left in with the failure active.

    I might also add that in troubleshooting the original battery issue I pulled both the MIM and RIM at the same time, even though loads were off at the time.  I now read that this is bad.  Is there a reset procedure to get everything back in sync?

  • Terry_Kennedy

    I'm just another user, not an APC employee, but I have a lot of experience with these units.

    If this is an original Symmetra RM (the 6KVA one), everything is on a single I2C bus and if any component is hogging the bus, all sorts of exciting / unpleasant things can happen. This may also be true of newer models, but I don't have any experience with them.

    If you can take the UPS offline (no loads connected to it - if you are using the optional SYTF2 and have a spare input outlet, you can plug the SYTF2 into there to power your loads while you diagnose the UPS), I would suggest starting with the absolute minimum configuration - only the MIM, a single PM and a single battery connected. Remove the RIM, any additional power modules / batteries, and the network management and environmental monitoring cards (if installed). Power up and see if the unit behaves normally. If it does, install additional components one at a time (waiting 10 minutes or so after each is installed to see if the UPS starts misbehaving) until you find the one that is causing the problem.

    If the UPS still misbehaves in the minimum configuration, try a different PM or battery pack (one at a time). If you have a spare MIM, you could try that as well (a unit with only a RIM and no MIM will work as a UPS if there are no other problems, but is a lot harder to diagnose). You could also try removing the communication card (with the external battery data cable connector) from the chassis, but I don't remember offhand if that is necessary for proper UPS operation. I think the UPS will run without it.

    Also note that some flaky hardware can cause events that completely confuse the UPS and require a complete state reset - use the manual bypass switch on the back (not from the control panel) to go into bypass, then remove the MIM/RIM, all power modules and battery packs, then re-insert in that order to get things back to a sane state, then take the unit out of bypass. This is somewhat better in newer MIM/RIM firmware, but even the latest firmware is still twitchy when presented with abnormal events.

    I believe the latest firmware is 213.157.W (overall UPS), 157/226 (MIM master/slave, RIM) and 311 (PM). APC does not provide firmware updates or change notes for this unit (other than purchasing new modules), though.
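The one-at-a-time isolation procedure described above can be written down as a small sketch. This is only an illustration of the logic, not a tool that talks to the UPS: `find_faulty_component` and the `behaves_normally` callback are hypothetical names, and the callback stands in for the operator's own observation after each step.

```python
from typing import Callable, List, Optional


def find_faulty_component(
    minimum: List[str],
    extras: List[str],
    behaves_normally: Callable[[List[str]], bool],
) -> Optional[str]:
    """Return the first added component that makes the UPS misbehave.

    Returns "minimum configuration" if the fault is already present with
    only the minimum set installed, or None if nothing misbehaves.
    """
    installed = list(minimum)
    if not behaves_normally(installed):
        # Fault is in the minimum set: swap the PM, battery or MIM next.
        return "minimum configuration"
    for part in extras:
        installed.append(part)
        # In practice, wait about 10 minutes here before judging the result.
        if not behaves_normally(installed):
            return part
    return None


# Example with a stand-in observation: pretend the second PM is the culprit.
suspect = find_faulty_component(
    ["MIM", "PM 1", "battery 1"],
    ["RIM", "PM 2", "network card"],
    lambda installed: "PM 2" not in installed,
)
print(suspect)  # PM 2
```

Adding parts one at a time means the last part installed before the fault appears is the prime suspect, which is why the order of `extras` matters less than changing only one thing per step.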

    In my experience, the most troublesome electronic component in this product is the SYXRCC in the external battery chassis - they tend to fail intermittently and cause the UPS to report dozens of events (some unrelated to batteries) as they send garbage over the shared I2C bus.

    The most troublesome mechanical component is the fans. Only the main chassis fan is monitored for speed - the fans in the power modules and the fans in the SYTF2 are not monitored. You can shine a flashlight through the grille to see if the fan blades are spinning or not. NOTE: Only one fan in the SYTF2 spins - the second is a backup. I've replaced every fan in multiple chassis over the years. NOTE: APC considers only the main chassis fan as a user-replaceable component - you're on your own and unsupported if you change the other ones.

    If you rebuild the SYBT2 battery packs yourself (as I do), there is a finite life to the battery cable terminals (faston F2) and I replace them every other time I rebuild a battery pack. High resistance / poor connections can cause all sorts of problems, and battery monitoring in these units is primitive by modern standards - there is only a single sense line for each pack at the half-way (60V nominal) point in the string, and all battery packs are in parallel, so if any pack is stuck saying "charge me!", the UPS will try [over] charging the other packs as it tries to bring a single battery pack up to a good state of charge.
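To make the single mid-string sense line concrete, here is a minimal arithmetic sketch. It assumes the tap sits exactly at the half-way point of the string, as described above. The function names and the 2 V tolerance are made up for illustration and are not the thresholds the UPS firmware actually uses.

```python
def midpoint_imbalance(string_volts: float, tap_volts: float) -> float:
    """Upper-half voltage minus lower-half voltage for a center-tapped string."""
    lower_half = tap_volts
    upper_half = string_volts - tap_volts
    return upper_half - lower_half


def tap_looks_suspect(
    string_volts: float, tap_volts: float, tolerance_volts: float = 2.0
) -> bool:
    """True if the two halves differ by more than a (hypothetical) tolerance."""
    return abs(midpoint_imbalance(string_volts, tap_volts)) > tolerance_volts


# A 128 V string whose tap reads 58 V has a 70 V upper half and a 58 V lower half.
print(midpoint_imbalance(128.0, 58.0))  # 12.0
print(tap_looks_suspect(128.0, 58.0))   # True
print(tap_looks_suspect(128.0, 64.0))   # False
```

A low tap reading can come from a weak block in the lower half or from a poor connection in the sense wiring itself, so a multimeter check at the tap is worth doing before condemning a pack.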

    Here are some blog posts of mine with potentially useful information:

    Battery replacement (1)

    Battery replacement (2)

    Fan replacement

    Front panel (Failed / Fail) battery replacement

  • RogerinMD

    Thanks so much for the detailed pointers.  So here's what I got:

    SYHF6KT chassis, 2x SYPM2KU power modules, 1x SYBT2, 1x SYMIM3, 1x SYRIM3, SYCC, AP9619EM network management card

    Maintenance bypass on throughout.

    Turn off system enable, remove all, plug back in SYMIM3, SYPM2KU, SYBT2 with display into SYMIM3, system enable on, boots to 2 green LEDs on MIM, fans on SYPM2KU running, display blank.  Wait 10 min plug in SYCC, display comes up, faults to internal comm failed after initialization, still 2 green LEDs on MIM.   Tried the other SYPM2KU, same thing.

    System enable OFF, remove all, install SYMIM3, SYBT2, SYCC, system enable on, MIM boots to 2 green LEDs, display shows only "UPS in Maint Bypass" and "No working Power Module Found".  MIM details verify firmware at 157/226.  Plug in SYPM2KU, fans turn on, message "Power Module increased", then "internal communications failed".

    System enable OFF, remove all, install SYRIM3, SYBT2, SYPM2KU and SYCC, connect the display to the back jack on the SYCC.  System Enable ON - screen loads but is stuck on "Please Wait..." for 5+ minutes.

    System enable OFF, remove all, install SYMIM3, SYRIM3, SYBT2, SYPM2KU, no display or SYCC.  Enable ON.  Yellow LED on SYRIM for about 10 seconds, 2 green on SYMIM, then 2 green turn on on the SYRIM, with the internal bypass relay pulling in and the fans turning on in the SYPM2KU.

    Feels like maybe the NXP P80C552EBA microcontroller is bad.  Is there a connection point on the I2C bus to watch?  I've got an I2C to USB converter for Arduino programming lying around. 

    This is a second hand unit that has been great for home NFS and print backup power for years and I hate to see it go.  And yes, I've done my fair share of battery replacements, but never had problems with connectors (that I know of).

  • Terry_Kennedy

    I'm traveling right now and don't have a Symmetra RM handy. Your tests seem to prove that the SYCC is needed to talk to the front panel, even when the panel is plugged directly into the MIM. I didn't remember if that was the case or not.

    With no spare SYCC it will be hard to pinpoint the issue - I've amassed a large pile of various Symmetra RM components over the years to help in troubleshooting.

    I'm reluctant to tell you to throw parts at it in the hope that something fixes it, but unless someone else here has a better idea, that may be the only way to go to salvage the unit. If you happen to be in the greater New York City area, I could bring some parts when I get back. I normally have a few complete RM chassis plus boards for another 3 or 4 chassis worth on hand.

    The battery connectors tend to only be a problem if the battery has burst and leaked acid on the connector, or if excessive force is used to release the faston "catch" when removing the cable from the battery.

    I don't know if this is still available, but a few years ago you could buy a mostly-empty Symmetra RM chassis (MIM/RIM/SYCC but no power modules, batteries or transformer) from Tech Data (commercial reseller account required) for less than the cost of the AP9631 card included for free in the new chassis! I took the opportunity to "refresh" a number of my RM chassis that way.

  • RogerinMD

    Unfortunately I'm down in the DC Metro area, but thanks for the offer!

    I've taken the route to purchase some "spare parts", a SYMIM, a bare NXP P80C552EBA processor chip for the MIM, and a SYCC for a reasonable price.  I'll see if any of those does it.

    I'm fairly gentle on the battery Faston F2's, though I've had a few cells overheat and swell, but no major leaks.  I'll post a follow-up here when I resolve things one way or another.

  • RogerinMD

    No joy.  Swapping out things one at a time results in the same ultimate Internal Communication Failure.  What "intelligence" is inside the power unit or the chassis that's left?  I can engage the bypass mode from the front panel and the power units will begin to charge the battery without issue, taking it up to actual 138.0V/nominal 120.0V.
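As a sanity check on those readings, the string voltage can be converted to volts per cell. This sketch assumes a plain lead-acid string at 2.0 V nominal per cell, so 120 V nominal means 60 cells in series. The helper name is hypothetical.

```python
NOMINAL_VOLTS_PER_CELL = 2.0  # lead-acid nominal cell voltage


def volts_per_cell(string_volts: float, nominal_volts: float = 120.0) -> float:
    """Average per-cell voltage of a series lead-acid string."""
    cells = round(nominal_volts / NOMINAL_VOLTS_PER_CELL)  # 60 cells for 120 V
    return string_volts / cells


print(round(volts_per_cell(138.0), 2))  # 2.3
print(round(volts_per_cell(128.0), 2))  # 2.13
```

About 2.3 V per cell is in the usual float-charge range for sealed lead-acid batteries, which is consistent with the charger in the power modules working normally even while the communication fault is active.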

    Is there any debugging that can be done through the serial port on the SYCC?

  • Terry_Kennedy

    Sorry I didn't see this until now - the forum doesn't seem to email notifications. I'm traveling (again). There are some people here who seem to have documentation on the serial port - at least one member has (maybe his company sells?) a utility program that uses the diagnostic interface rather than the network card for monitoring. If we're lucky, he'll see this and reply. If not, let me know and I'll see if I can find his email and a way to get it to you (this forum only allows PMs between designated individuals).

    I don't know of any documentation outside of APC that explains anything in more detail.

    When I get back, if you're still stuck I'll see if I can pull apart one of the RM's that I "refreshed" by replacing the whole chassis. If there isn't anything obvious, I can send you the backplane and anything else with circuitry out of one of them and see if that helps.

    Edited to add: If you need to contact me, you can use the temporary throwaway address terry-apcxyz123 at tkphotos dot com

  • RogerinMD

    Terry,

    Thanks again for the reply.  I don't know if APC is paying you, but they should since they don't answer their own forums.

    I took the chassis down to bare metal to check all the boards and connections and didn't see anything abnormal except for some excessive dust buildup around the card edge connector for the AC input/bypass chassis.  I cleaned the RM backplane board and soldered all the press-fit card edge connectors just in case one of those was bad, put it all back together and still got the same error.

    I found a pinout for the serial cable for the communications card and wired one up to a 2400,8,1,N,no flow serial port but get no response.  The Network interface is pulling an IP from my router (as judged by the beginning of the MAC address that APC states in their manual) if I connect that, but any navigation to it times out.  I've tried to reset that with the pinhole reset button too.
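For anyone repeating the serial test, here is a minimal sketch of the 2400,8,N,1, no-flow-control setup using the third-party pyserial package. The `Y` probe is the "enter smart mode" command commonly documented for APC Smart-UPS units, which normally answer `SM`. Whether the SYCC port on a Symmetra RM responds to it is an assumption, and the function names are hypothetical.

```python
def open_port(device: str, timeout_s: float = 2.0):
    """Open a serial device at 2400 baud, 8 data bits, no parity, 1 stop bit."""
    import serial  # pyserial (third-party); imported here so the probe is testable without it

    return serial.Serial(
        device,
        baudrate=2400,
        bytesize=serial.EIGHTBITS,
        parity=serial.PARITY_NONE,
        stopbits=serial.STOPBITS_ONE,
        xonxoff=False,  # no software flow control
        rtscts=False,   # no hardware flow control
        timeout=timeout_s,
    )


def probe_ups(port) -> bytes:
    """Send the 'Y' probe and return whatever comes back before the timeout."""
    port.write(b"Y")
    return port.read(8)  # an empty result means no response at all
```

With a real adapter this would be called as `probe_ups(open_port("/dev/ttyUSB0"))`, where the device name is only an example. An empty reply does not distinguish a dead port from a wrong cable pinout, so the cable is still worth checking with a meter.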

    I feel like it may be something with the power units at this point, because I have no faults (other than no power unit) until I plug one of the two in, then I get a DIAG error that says NoCom MIM/RIM between either/both of the power units and the RIM.

    At this point I'm just going to keep hacking as a matter of principle, and maybe attempt to hook up an I2C to the frame extension port and see what I can see.

    Thanks again.

  • Terry_Kennedy

    Have you tested the unit with the network card (and the optional AP9612 card) out of the smart slots in the back? I think you did, but wanted to make sure.

    When I get back I could send you a loaner SYPM2KU - it'll be one of the older ones, but should help eliminate that as an issue. It is unlikely for both to fail at once, though.

  • RogerinMD

    I did.  Starting with the chassis, MIM/RIM, 1 batt, and SYCC, it powers up, I get two green LEDs on the MIM, pulls in the main internal bypass and then 2 green LEDs on the RIM about 5 seconds later.  Battery voltage about 129.0

    Take out of maintenance bypass, and I’ve got only one error and the red front panel LED for no working power module.  I can plug the power module in and leave the latch down and no issues.  As soon as I engage the latch to release the power unit microswitch the red fault light will go out for about 3 seconds, then turn back on for the internal communication failure.  

    Weird thing is, the PMs are doing something: if I leave one in, it will start charging the battery up to 138V.  If I turn on the maint bypass with the PM & battery installed it actually drains the battery.  I guess that's because the battery is the only source of input power for the unit at that point.

    Let me know when you're back in town.  Thanks.

  • RogerinMD

    It's been a while, but an update for anyone following this thread...

    I've been poking at this thing for a while in my spare time (which has been hard to come by), but started getting more aggressive with diagnostics.  After MIM and SYCC swaps to no avail, I peeled back the label on the controller inside the SYPM2KU and found it was a Microchip PIC16F77.  I pulled a pinout, and bought a spare and a programmer for $57.  I started by soldering sample lines onto pins 18 and 22 of the chip (SCL, SDA of the I2C), hooked up a scope and plugged the module into the rack.  I immediately saw the source of my communication failure: there was a 15V peak-to-peak 60Hz AC sine wave on what was supposed to be the data bus.  After having it in the unit for about 5 minutes, I pulled it, popped the cover and took a look with an IR camera; both TH2 and R20 were quite hot.  TH2 was wrapped in heat shrink tubing, so I cut it back and found a part marked "TKS PG 102", which I couldn't cross reference.  I eventually found this to be a PTC thermistor to control inrush currents, presumably from Thinking Electronics in Taiwan, but on contacting them and sending a picture, they said this was a counterfeit part, as their part would be green, not gray.  Hmmmm....

    TH2 and R20
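If the scope can export samples, the same check (a large swing at mains frequency on a logic line) can be done numerically. This is a rough sketch using rising mean-crossings to estimate frequency. The 5 V logic level is an assumption about the bus, the function names are hypothetical, and `synthetic_hum` only generates a made-up test signal, not data from this unit.

```python
import math
from typing import List, Sequence


def peak_to_peak(samples: Sequence[float]) -> float:
    return max(samples) - min(samples)


def estimate_frequency(samples: Sequence[float], sample_rate_hz: float) -> float:
    """Estimate the dominant frequency from rising crossings of the mean."""
    mean = sum(samples) / len(samples)
    crossings: List[float] = []
    for i in range(1, len(samples)):
        prev, cur = samples[i - 1] - mean, samples[i] - mean
        if prev < 0 <= cur:
            # Linear interpolation of the crossing position between samples.
            crossings.append((i - 1) + (-prev) / (cur - prev))
    if len(crossings) < 2:
        return 0.0
    samples_per_cycle = (crossings[-1] - crossings[0]) / (len(crossings) - 1)
    return sample_rate_hz / samples_per_cycle


def looks_like_mains_on_bus(
    samples: Sequence[float], sample_rate_hz: float, logic_volts: float = 5.0
) -> bool:
    """True if the swing exceeds the logic level and sits near 50 or 60 Hz."""
    freq = estimate_frequency(samples, sample_rate_hz)
    near_mains = any(abs(freq - f) < 2.0 for f in (50.0, 60.0))
    return peak_to_peak(samples) > logic_volts and near_mains


def synthetic_hum(
    amplitude: float = 7.5,
    freq_hz: float = 60.0,
    sample_rate_hz: float = 6000.0,
    seconds: float = 0.1,
) -> List[float]:
    """Generate a made-up sine wave for trying the functions above."""
    n = round(sample_rate_hz * seconds)
    return [
        amplitude * math.sin(2 * math.pi * freq_hz * i / sample_rate_hz + 0.3)
        for i in range(n)
    ]
```

Fed the synthetic 7.5 V amplitude signal, the estimates come out at roughly 15 V peak-to-peak and 60 Hz. A healthy I2C line should instead show square-edged transitions between ground and the logic supply.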

    Rather than dig further into this, I lucked out and found a SYPM2KU on eBay for $125, bought that, plugged it in, and the UPS fired right up with all the other original hardware.

    Apparently, the power event that started this whole thing took out BOTH SYPM2KU power units in one shot.  At this point I suspect the EEPROMs are fine, and that either the PTCs or MOVs blew, which is cross-feeding AC into the DC bus somewhere within the SYPM2KU.

    Knowing now that it's definitively the power units, I'm planning to keep stripping the power unit down to ID the failed component that is not readily apparent with a simple visual, though I've yet to cut the heat shrink off of the MOVs, which I've had fail before in other things...

    Anyone know of a schematic for the SYPM2KU?
