Transaction gateway: Partial outage
Incident Report for Clearhaus
Postmortem

We are sorry for the transactions your customers could not complete while our service was not fully functional. Our upstream service provider has determined that an unfortunate chain of events led to the incident. While redirecting traffic from their primary to their secondary site in preparation for maintenance, they were not aware that one or more network ports on the secondary site were out of order, which caused traffic to be lost. Under normal circumstances they would have been alerted to the condition; however, alerting for the secondary site was itself experiencing various issues leading up to the incident. They acknowledge that this chain of events is completely unacceptable and have introduced additional checks and balances to prevent it from recurring.

As noted in the incident updates, we discovered a way to improve the transparency of such incidents, namely to display failing transactions in our dashboard. This improvement is currently being implemented and is intended to be in place within a few weeks. The investigation of this incident also identified additional potential improvements regarding the uncertainty about the state of a failed transaction, which even touch upon extending our API with “idempotency”. If you are a partner and have an opinion or request regarding this, please feel free to reach out to us, perhaps even through an issue on our public documentation.
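
For readers unfamiliar with the term, the following is a minimal sketch of how idempotency keys generally work. It is not our API and not a committed design: the class `IdempotentGateway`, the `idempotency_key` parameter, and the response fields are hypothetical names chosen for illustration only. The idea is that the client generates one key per logical payment and reuses it when retrying, so a retry after a lost response returns the stored result instead of creating a second authorization.

```python
import uuid


class IdempotentGateway:
    """Toy in-memory gateway used only to illustrate idempotency keys."""

    def __init__(self):
        self._results = {}  # idempotency key -> stored response
        self.processed = 0  # authorizations actually executed

    def authorize(self, idempotency_key, amount, currency):
        # A repeated key returns the stored response; nothing is processed twice.
        if idempotency_key in self._results:
            return self._results[idempotency_key]
        self.processed += 1
        response = {
            "id": str(uuid.uuid4()),  # hypothetical transaction id
            "amount": amount,
            "currency": currency,
            "status": "approved",
        }
        self._results[idempotency_key] = response
        return response


def demo():
    gateway = IdempotentGateway()
    key = str(uuid.uuid4())  # generated once by the client per logical payment
    first = gateway.authorize(key, 1000, "EUR")
    retry = gateway.authorize(key, 1000, "EUR")  # e.g. after a lost response
    return gateway.processed, first == retry
```

In this sketch, a client that never received the first response can safely retry with the same key and learn the transaction state without risking a duplicate authorization.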

Sincerely,
your operations team

Posted Jun 24, 2020 - 13:29 UTC

Resolved
We have been monitoring transactions closely and have received confirmation from our service provider that the issue is remedied and that transactions are processing normally.

Our initial review confirms that our fail-over mechanisms and manual mitigation actions were appropriate, but we will continue to look for potential improvements.
Posted Jun 16, 2020 - 14:02 UTC
Monitoring
Our service provider has confirmed the issue and we are actively working to understand what caused it to occur.

Please be advised that the failing requests do not appear in the dashboard; however, the HTTP responses were appropriately returned.
Posted Jun 16, 2020 - 12:40 UTC
Investigating
Between 2020-06-16 11:06:22 UTC and 2020-06-16 11:28:24 UTC, authorizations and voids failed with status code 50000.

We are continuously investigating and will post an update as soon as possible.

For now, service has been restored.
Posted Jun 16, 2020 - 11:46 UTC
This incident affected: Payment Processing APIs (Gateway API (gateway.clearhaus.com)).