Post by Charles McIntyre, UCSC. The UC NeXt group, an ITLC subcommittee, wants to discover and connect the pockets of innovation and collaboration across UC. We have a survey so all UC IT staff can participate and share their success stories and lessons learned.
Yes, there are prizes! Each week for three weeks, we are randomly drawing winners from the respondents: two this week, three next week, and four in the final week. The prizes are tickets to the annual UC Computing Services Conference (UCCSC). We’ll also be reading all the entries and highlighting our favorites on this blog.
The first drawing winner is Chris Thompson, UCLA. Congratulations, Chris, and see you at the UCCSC.
We picked the Cloud Archival Storage Service (CASS), also at UCLA, as our favorite innovative solution. It’s a cost-effective, simple, and enterprise-grade solution to a common operational requirement for backups. I asked Scott Friedman, primary CASS engineer, a few questions:
What was the driver for this solution?
Affordable storage. Originally, it was meant for research purposes, but it turns out everyone needs storage. So the drivers were affordability and flexibility.
CASS is really Cloud NAS – cloud network-attached storage – and it turns out this is something that many people at UCLA, and now a few other places around UC, need and want. We wanted an enterprise-level system, so we used the buying leverage we have built up from doing HPC for so long to build something at a cost that lets us approach Amazon Glacier pricing.
How did you come to this specific architecture?
The system needed to be enterprise grade, resilient to failure, and as automated as possible. Resilience to failure is essential because we need to minimize our support exposure to keep costs low. This way, a failure can occur and the system keeps running without users even noticing. It also lets us batch up repairs and, most of the time, make them while the system is live, so we have very little downtime. Automation is important for the same support reasons – the more that’s automated, the more consistent the system is, and the less twiddling we have to do.
What factors contributed to your success?
The flexibility shown by the administration here at UCLA. The provost asked for entrepreneurial solutions. We needed seed funding (a loan) to get this going and they came through. We needed tiered rates (because that’s what we all expect) and they accommodated us. Another factor is the goodwill of the research and IT staff who are using CASS. Also, CASS is completely self-funded. That gives it longevity – necessary for a system that is, at least in part, meant to be archival.
Anything else?
I am currently looking into setting up a disaster recovery (DR) solution to go with CASS. Right now all data is stored at UCLA in two separate data centers. Many Southern California folks would like to get their data “up north” and, potentially, vice versa. We are also working with UCLA Health and the medical school on HIPAA compliance (you can use CASS now for HIPAA; it’s just more complicated). Finally, I am working on system optimizations that improve throughput and hide latency so that remote (non-UCLA) users can use CASS more effectively.
Congrats to CASS!