We are grateful for your continued support and loyalty. We value your feedback and appreciate your patience as we worked to resolve this incident.
Type of Event:
S2 – Visitor (PXC) API and Dashboard Performance Issues
Services/Modules Impacted:
Visitor Dashboard, web application, and API
Issue Summary/Background:
An increase in API response time and a significant spike in HTTP 500 errors were observed. Upon investigation, the logs revealed a rise in MySQL deadlocks and application server errors.
Specifically, the following error was noted in the application logs:
[HPM] Error occurred while trying to proxy request /full/app/CO-DEDP117/vm/kiosk/instances from app.example.com to https://api.example.com (ECONNRESET)
These issues affected the overall application performance and user experience.
Root Cause:
High latency and increased MySQL deadlocks.
Remediation:
While reviewing MySQL and application server logs, the team identified an increase in MySQL deadlocks and application connection resets. To mitigate the issue, the Eptura CloudOps team performed a database restart, which returned the application to normal functionality. After the restart, API response times improved significantly and HTTP 500 errors decreased substantially.
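As an illustration of the log review described above, the following is a minimal sketch of counting deadlock and connection-reset entries per hour. It is not the tooling the team used. The log line layout (a leading "YYYY-MM-DD HH:MM:SS" timestamp), the function name count_by_hour, and the sample lines with their dates are assumptions made for this example. The deadlock text matches the standard MySQL error 1213 message.

```python
import re
from collections import Counter

# Assumed log line shape: "YYYY-MM-DD HH:MM:SS <message>". Real formats may differ.
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}):\d{2}:\d{2}")

# Substrings to look for. The deadlock text is the MySQL error 1213 message;
# ECONNRESET is the error code shown in the proxy log entry.
PATTERNS = {
    "deadlock": "Deadlock found when trying to get lock",
    "conn_reset": "ECONNRESET",
}


def count_by_hour(lines):
    """Count matching log lines, keyed by (hour bucket, pattern label)."""
    counts = Counter()
    for line in lines:
        match = TIMESTAMP.match(line)
        if not match:
            continue  # skip lines without the assumed timestamp prefix
        hour = match.group(1)
        for label, needle in PATTERNS.items():
            if needle in line:
                counts[(hour, label)] += 1
    return counts


# Hypothetical sample lines, for illustration only (not taken from the incident).
sample = [
    "2025-01-06 09:22:10 ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction",
    "2025-01-06 09:22:11 [HPM] Error occurred while trying to proxy request (ECONNRESET)",
    "2025-01-06 09:40:03 ERROR 1213 (40001): Deadlock found when trying to get lock; try restarting transaction",
    "2025-01-06 10:01:00 GET /health 200",
]

for (hour, label), n in sorted(count_by_hour(sample).items()):
    print(hour, label, n)
```

Grouping the counts by hour makes a sudden rise in deadlocks or connection resets easier to line up against the timeline entries.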
Timeline:
All times listed in CEST
9:22 a.m.: A large spike in 5xx HTTP response errors and elevated response times from API endpoints were observed.
2:56 p.m.: An incident was reported, prompting Eptura to initiate the investigation.
3:51 p.m.: Eptura updated the Visitor status page to reflect the incident and investigation.
4:57 p.m.: The Eptura Infra team identified the root cause of the issue and reported that response times had stabilized.
5:53 p.m.: The Eptura team updated cases to ask customers for initial feedback on the issue.
6:22 p.m.: The Eptura team updated the status page to confirm that the issue was resolved.
Total Duration of Event:
1 hour 12 minutes
Preventive Actions:
To mitigate future occurrences, we have scheduled a proactive MySQL service restart every Monday before working hours to maintain database stability.
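For illustration only, the weekly schedule above can be sketched as a small calculation of the next restart slot. The 06:00 slot, the constant names, and the function name next_restart are assumptions for this example. The actual restart time and scheduling mechanism are not stated in this report.

```python
from datetime import datetime, time, timedelta

RESTART_WEEKDAY = 0        # Monday (datetime.weekday() returns 0 for Monday)
RESTART_TIME = time(6, 0)  # hypothetical "before working hours" slot


def next_restart(now: datetime) -> datetime:
    """Return the next Monday restart slot strictly after `now`."""
    days_ahead = (RESTART_WEEKDAY - now.weekday()) % 7
    candidate = datetime.combine(now.date() + timedelta(days=days_ahead), RESTART_TIME)
    if candidate <= now:
        # Already past this week's slot, so move to the following Monday.
        candidate += timedelta(days=7)
    return candidate


# Example call with an arbitrary date (a Wednesday).
print(next_restart(datetime(2025, 1, 1, 12, 0)))
```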