In case you have not noticed, we are down.
First and foremost, there is no data loss. There is/was no security exposure. We are doing an upgrade 🙂
I’m not going to butter this up. The extent of this outage was not planned and we know it is frustrating for you. I contemplated doing this in video form but I have not slept in a bit and my fear is my current appearance may not be very CEO-like. While I am not sure this post is going to be received as being very CEO-like, I’m going to make it anyway.
In all seriousness, we were unaware of a limitation in a software/hardware combination and, because of that, we have unplanned downtime. We owe you an explanation and next steps. Ultimately this combination was my decision and in that respect, I am the one who owes you the explanation.
- Dwolla’s database is getting really big really fast.
- We made a small change yesterday and we planned to be updating for about 30 minutes.
- During a transfer we found a limitation in the operating system we run on.
- The transfer limitation meant a big upgrade. The upgrade took the better part of the night. Big upgrades do not happen very fast.
- Now the data needs to be re-attached to the upgraded and re-configured operating system, and run through the paces. With things of this complexity, it takes time.
What is currently happening?
- Our resources have shifted to assisting our hosting partner with the resolution and next steps.
A few things
- This is an OS/hardware issue.
- There is no data loss. There is/was no security exposure.
- This is something we are solving and know how to deal with in the future. In the meantime we simply have to work through it. While it is unexpected and it is stressful, it needs to be recognized and dealt with.
- This is not something we can solve by just throwing a bunch of data on the Amazon cloud or something. We have a highly customized hosting environment for a reason and unfortunately, due to this design, we can’t just plop it over onto another computer. It has to be done very carefully by the right people.
- We cannot speed it up right now. There is no faster or easy button for what we are doing.
What we are doing to remedy the situation in the future
- Modifications to the existing environment that deal with an OS limitation.
- Groundwork laid for another environment (which will take considerably more time) that is on the docket for the first quarter.
When we will be back up
- Tonight. What we are doing right now is transferring massive amounts of data and we are doing that as quickly as humanly possible but as safely as humanly possible. I’ve made a choice not to lean into people to make them frantic and nervous. When people are frantic and nervous they make mistakes and the data is too precious. This needs to be done the right way even if typing this stinks… I feel like this is the right way to handle it.
If you want to scream at anyone or just generally go off on someone, I would ask that you do it to me. Ultimately all the choices that led us here I signed off on and I am not going to sit here and point the finger at other people. This is on me. Our staff is working hard and smart. I am frustrated and I am upset (like many of you) but my energy now needs to stay with assisting with the resolution.
Please feel free to reach out to me directly. My e-mail is firstname.lastname@example.org.