June 15 - Incident cause analysis

Dear Merchant,

We have completed further analysis of the cause of the disruption on 15 June.

What happened:

On 15 June, at 12:51, one of our keyservers stopped delivering correct encryption and decryption keys to the data processing applications due to a software malfunction.

Our incident response team received immediate alerts regarding this issue. The first analysis showed no impact on processing because backup keyserver infrastructure handled the load of the malfunctioning keyserver.

At 12:55, one of the servers handling WPF and COPYandPAY stopped serving correct responses to merchants because it was receiving incorrect responses from the malfunctioning keyserver.

This issue affected roughly 10% of the traffic on those APIs; other machines were unaffected. The alerts from this second issue were not separated from the alerts caused by the malfunctioning keyserver, so the second issue remained hidden for a short time and we did not immediately move to mitigate the effect on live processing. At 14:18, the incident response team took the affected server out of traffic, which restored full processing capability.
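To illustrate why unseparated alerts can hide a second problem, here is a minimal sketch that groups incoming alerts by the component that raised them, so that a new source stands out even while an earlier one keeps firing. It is an illustration only: the `Alert` class, the `group_by_source` function, and the component names are hypothetical and do not describe the actual monitoring setup.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    source: str   # hypothetical component name, e.g. "keyserver-1"
    message: str  # short description of what was observed
    minute: str   # "HH:MM" time at which the alert fired


def group_by_source(alerts):
    """Return {source: [alerts]} so each affected component is listed separately."""
    groups = defaultdict(list)
    for alert in alerts:
        groups[alert.source].append(alert)
    return dict(groups)


# Illustrative alert stream: one noisy source plus a single alert from a second one.
alerts = [
    Alert("keyserver-1", "incorrect key response", "12:51"),
    Alert("keyserver-1", "incorrect key response", "12:52"),
    Alert("api-server-3", "incorrect responses to merchants", "12:55"),
    Alert("keyserver-1", "incorrect key response", "12:56"),
]

grouped = group_by_source(alerts)
for source, items in grouped.items():
    print(f"{source}: {len(items)} alert(s), first at {items[0].minute}")
```

In a single chronological list the one alert from the second source sits between repeats of the first; grouped by source, it appears as its own entry.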

Learnings and next steps:

  • Fix the malfunctioning keyserver
  • Increase responsiveness of applications interacting with the keyserver
  • Improve handling of multiple and simultaneous alerts
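
As an illustration of how an application could react to incorrect keyserver responses, the minimal sketch below validates each response and falls back to the next keyserver when validation fails. It is a sketch under stated assumptions, not a description of the production systems: `fetch_key`, the stand-in keyserver functions, and the 32-byte validity check are all hypothetical.

```python
def fetch_key(keyservers, is_valid):
    """Try each keyserver in order and return the first key that passes validation."""
    for server in keyservers:
        try:
            key = server()
        except Exception:
            continue  # this keyserver failed outright: try the next one
        if is_valid(key):
            return key
        # The key was returned but failed validation: fall through to the next server.
    raise RuntimeError("no keyserver returned a valid key")


# Hypothetical stand-ins for a malfunctioning primary and a healthy backup.
def broken_primary():
    return b""        # malformed (empty) key


def healthy_backup():
    return b"k" * 32  # well-formed key under the assumed 32-byte check


key = fetch_key([broken_primary, healthy_backup], lambda k: len(k) == 32)
```

Validating the response at the caller means an incorrect answer from one keyserver is treated like an outage of that keyserver, rather than being passed on to merchants.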

Thank you

Your Peach Payments Team
