Hi,
We have several IP7000 phones, and all of them suffer from occasional resets, usually during calls. All are running UCS 4.0.4 and connect to an NEC SV8100 server.
The app log from one of them is attached, and shows the problem:
0923022228|wdog |*|03|Watchdog Expired: tSup
The CPU usage dump suggests that this process has swamped the CPU for 5 seconds, and so the watchdog restarts the unit.
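For anyone wanting to check their own logs for the same event, here is a minimal Python sketch that scans app log lines for watchdog expirations. It assumes the pipe-delimited layout seen in the line above (timestamp, component, level, a two-digit counter, message). The field names and the helpers `parse_line` and `watchdog_expirations` are illustrative labels, not anything taken from Polycom documentation.

```python
from typing import Iterable, List, NamedTuple, Optional, Tuple


class LogEntry(NamedTuple):
    # Field names are assumptions based on the sample line in this thread.
    timestamp: str
    component: str
    level: str
    counter: str
    message: str


def parse_line(line: str) -> Optional[LogEntry]:
    """Split one pipe-delimited log line into its five fields, or return None."""
    parts = line.rstrip("\n").split("|", 4)
    if len(parts) != 5:
        return None
    return LogEntry(*(p.strip() for p in parts))


def watchdog_expirations(lines: Iterable[str]) -> List[Tuple[str, str]]:
    """Return (timestamp, task name) for every 'Watchdog Expired' entry."""
    hits = []
    for line in lines:
        entry = parse_line(line)
        if entry is None or entry.component != "wdog":
            continue
        if entry.message.startswith("Watchdog Expired:"):
            task = entry.message.split(":", 1)[1].strip()
            hits.append((entry.timestamp, task))
    return hits


sample = ["0923022228|wdog |*|03|Watchdog Expired: tSup"]
print(watchdog_expirations(sample))  # [('0923022228', 'tSup')]
```

Running it over a whole log file (for example `watchdog_expirations(open(path))`, with `path` pointing at your own log) gives a quick list of when each reset happened and which task tripped the watchdog.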
Is there any way we can identify what happened, or any way to tune the watchdog settings to be less aggressive?
Thanks,
Paul.
Hello Paul,
welcome to the Polycom Community.
Based on the MAC address in the log, I can see that you have already tried to raise this directly with our Support. You will need to contact the reseller you have been provided with in order to raise this issue, as we need the configuration of the phone.
Please work with them so that someone from our Support Team in the UK can look into this for you.
Please ensure to provide some feedback if this reply has helped you so other users can profit from your experience.
Best Regards
Steffen Baier
Polycom Global Services
If official support is required please check how to phone or open a case here
Steffen,
Thanks for the response, but there's not much point in having a community forum if the standard response is to contact support, and the support response is please buy a support contract...
Paul.
Hello Paul,
the Community does not follow any SLAs and is not intended to replace our established support infrastructure.
As advised by our Tier 1 team, you need to work with the reseller from whom you purchased the phones.
They are able to raise a ticket with our support. If this fails, please contact me via community mail and we will find a way around this.
Best Regards
Steffen Baier
Hello Paul,
please check your community mail.
Best Regards
Steffen Baier
Hello Paul,
We're having this exact problem and I believe it has led to the death of several units (through repeated crashing causing file system corruption). I do actually have a ticket open with Polycom engineering right now, so I'll be happy to post some results when I have them.
-Sam
Hello Sam,
Paul's initial issue was an NTP refresh cycle that was too short. Once we corrected this and set the logs back to normal logging, the units have been fine so far (the ticket is closed and I have had no new feedback).
The file system issue you may be experiencing occurs on a few SSIP7000 units running UCS 4.0.5 or UCS 4.0.6 and should not affect any units running UCS 4.0.4 or older.
If you share your ticket number with me I can check if this is in my own region.
Best Regards
Steffen Baier
Steffen,
My ticket number is 1-527195191. I'm guessing you don't think the issue is the same between 4.0.4 and 4.0.5. But out of curiosity, what NTP settings were adjusted to correct the issue? Probably wouldn't hurt for me to try the same fix while I'm waiting for engineering to look at the logs and do a F/A on the units I sent back. I am running 4.0.5C on these units.
My apologies to Paul for hijacking his thread.
-Sam
Hello Sam,
I think in this case the NTP refresh was down to 300 seconds or so.
Best Regards
Steffen Baier
Polycom Global Services
Just an FYI that Polycom engineering advised trying an NTP resync value of 86400 seconds (one day). This does seem to have helped, but I had to replace 7 bricked phones before we got this setting in place. I suspect they bricked because the repeated reboots and crashes (sometimes crashes while dumping from the previous reboot) caused file system corruption.
Anyway, it appears that lower NTP resync values are likely triggering a thread race condition, ultimately spiking the CPU and causing the tSup process to crash. When tSup crashes, Watchdog reboots the phone. This is just conjecture, as I haven't heard back from engineering on the exact cause.
If this is a thread race condition issue, I wonder whether the longer sync value actually fixes it or just makes it occur less frequently. In other words, is the issue really fixed by the larger resync value, or did I just make the issue slightly less prevalent? Obviously, I'd prefer the former. I don't want any more customer complaints about rebooting or dead IP 7000 phones.
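To put rough numbers on that question, the sketch below only compares how often a phone resyncs at the two intervals mentioned in this thread (about 300 seconds versus 86400 seconds). It assumes, purely as conjecture, that each resync is an equal and independent chance to hit the suspected race. `resyncs_per_day` is a hypothetical helper for the arithmetic, not a phone setting.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def resyncs_per_day(resync_period_s: int) -> float:
    """Number of NTP resyncs in one day for a given resync period in seconds."""
    if resync_period_s <= 0:
        raise ValueError("resync period must be positive")
    return SECONDS_PER_DAY / resync_period_s


short_interval = resyncs_per_day(300)    # roughly the value reported earlier
long_interval = resyncs_per_day(86400)   # the value engineering advised

# Ratio of opportunities per day between the two settings.
print(short_interval, long_interval, short_interval / long_interval)
# 288.0 1.0 288.0
```

Under that assumption, the longer interval cuts the number of opportunities per day by a factor of 288 but does not bring it to zero, so it would make the problem rarer rather than remove the underlying cause.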
-Sam