EXECUTIVE SUMMARY
Herff Jones’ foundational application was built on an aging platform and failure-prone infrastructure, which slowed business and product innovation. Herff Jones implemented a modern DevOps approach and transformed their outdated system into a cutting-edge SaaS solution running 100% on AWS. With DevOps, Herff Jones built automated, flexible development pipelines that made daily releases possible, and implemented quality controls and automation to ensure stability.
Automating the infrastructure and release pipeline allowed Herff Jones to scale their development capacity. The development approach allowed Herff Jones to develop with the customer experience in mind. The result was a more efficient, reliable, and scalable platform that helped to drive their business forward.
THE BURNING PLATFORM
Their core product, eDesign, was running on an outdated, 14-year-old Flash-based system that was prone to failure, unsustainable, and nearing end of life. Herff Jones hit a crisis point: they needed to decide whether to invest in the product or let this part of their division perish.
Herff Jones was experiencing numerous problems, such as production issues arising on deadline days, technical bugs being left unresolved for months, technical debt build-up, outages that could last multiple days, and the inability to provide new features to their customers consistently.
Product development and innovation suffered because the team was continually working just to keep the system operational. These technical obstacles made it difficult to stay ahead of the competition and resulted in an undependable customer experience.
DevOps was not a priority in the existing system, so when CSTG was engaged to migrate and modernize the platform, we determined that a new, modern DevOps program was needed, covering everything from AWS automation to CI/CD pipeline management.
TECHNOLOGIES & SERVICES USED
DEVOPS
- Ansible
- Bitbucket
- Docker
- Jenkins
- Kubernetes
- New Relic
- SolarWinds
- Splunk
- Terraform
AWS SERVICES
- ALB
- Amazon ECS
- Amazon MQ
- Amazon RDS Oracle
- Auto Scaling
- CloudFront
- CloudTrail
- CloudWatch
- Database Migration Service (DMS)
- EC2
- Elastic Container Registry (ECR)
- EventBridge
- IAM
- Lambda
- Route 53
- S3
- SNS
- Systems Manager
- Transit Gateway
- VPC
APPLICATION DEVELOPMENT
- Angular
- Canvas.js
- Hibernate
- Java
- Spring
- Spring Boot
- TypeScript
MAJOR CHALLENGES & PROBLEMS TO SOLVE
• Build a reliable and consistent production environment that would be maintained with future releases.
• Deploy features to production more often than once every 6 months.
• Provide a scalable, predictable, reliable non-production environment to enable feature development, improve quality, and encourage innovation.
• Create a standard process for deploying infrastructure, microservices, code, and databases.
• Apply governance to environments to optimize security and costs.
• Embed DevOps into the overall process rather than treating it as an isolated area of the business.
SOLUTIONS
CleanSlate built a fully automated and scalable AWS infrastructure for the development, testing, performance, staging, and production environments, making extensive use of Terraform. The development environments were automated and could scale to as many development teams as needed, peaking at 12 for this project. Within this automation, every tier of the application (Client, Service, AWS Services, Integration, and Infrastructure) was coordinated into the overall deployment process.
All applications were refactored to run in configurable, lightweight, standardized Docker containers. The same CI/CD build and deploy pipeline was used across all environments to ensure consistent and repeatable deployments. Deployment automation was integrated with the Operations Team to provide monitoring and alerting notifications.
This large and complex environment required many tools, and integration among those tools, to maintain the environment processes and configurations needed for reliability, quality, and observability from application to infrastructure.
Governance was embedded into the overall DevOps program, from observability to cost optimization. CSTG built in monitoring to ensure the application remained performant, using tools such as Amazon CloudWatch, Splunk, New Relic, and Nagios. This level of tooling allowed for self-correction, scaling, and faster troubleshooting of the production application and infrastructure. CleanSlate also put a governance model in place for the production and non-production environments.
As a cost-saving measure in production, we incorporated auto-scaling so that capacity scaled up as workloads increased and scaled back down as they returned to standard levels. In non-production, all environments spun down during non-working hours and spun up during working hours, so that Herff Jones paid only for what they used.
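The two cost controls described above can be illustrated with a minimal Python sketch. It is not the actual implementation, which the case study does not describe. The working hours (weekdays, 08:00 to 18:00), the CPU thresholds, the capacity limits, and all function names are assumptions chosen purely for illustration.

```python
from datetime import datetime, time

# Hypothetical working-hours schedule (an assumption, not from the case study).
WORK_START = time(8, 0)
WORK_END = time(18, 0)
WORK_DAYS = {0, 1, 2, 3, 4}  # Monday=0 ... Friday=4
HOURS_PER_WEEK = 7 * 24


def should_be_running(now: datetime) -> bool:
    """Return True if a non-production environment should be up at `now`."""
    return now.weekday() in WORK_DAYS and WORK_START <= now.time() < WORK_END


def weekly_off_fraction() -> float:
    """Fraction of the week a non-production environment is spun down."""
    hours_on = (WORK_END.hour - WORK_START.hour) * len(WORK_DAYS)
    return 1 - hours_on / HOURS_PER_WEEK


def desired_capacity(current: int, cpu_percent: float,
                     minimum: int = 2, maximum: int = 10,
                     high: float = 70.0, low: float = 30.0) -> int:
    """Simple step scaling for production: add one instance under high load,
    remove one under low load, and stay within the [minimum, maximum] bounds.
    All thresholds here are illustrative assumptions."""
    if cpu_percent > high:
        target = current + 1
    elif cpu_percent < low:
        target = current - 1
    else:
        target = current
    return max(minimum, min(maximum, target))


if __name__ == "__main__":
    print(should_be_running(datetime(2024, 1, 1, 9, 0)))  # a Monday morning
    print(round(weekly_off_fraction(), 3))
    print(desired_capacity(current=3, cpu_percent=90.0))
```

Under this assumed schedule an environment runs 50 of the 168 hours in a week, so it is spun down for roughly 70% of the time. The real schedule and scaling policies may differ.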
With multiple non-production and production environments, a development and release management process was needed to ensure high-quality production releases and a product that could be sustained long term. Below you can see the robust development process and how infrastructure, databases, code, and all other components were managed at each stage, either through automation or through human reviews within defined automated workflows.
SUCCESS METRICS
CleanSlate built Herff Jones a very robust and automated solution that incorporated quality and predictability into the environment. Some success metrics for the business as part of this process:
- Feature releases improved from every 6 months to on demand.
- Cost optimization saved around 70% on non-production environment costs.
- Overall application quality and reliability improved through a focus on quality processes throughout development and DevOps.
- Development teams scaled by 10x thanks to automated development environments, which allowed for faster time to market.
- New features and proofs of concept (POCs) became possible due to the automated scaling of development/POC environments.
- Overall downtime improved significantly: the legacy system experienced outages or performance degradation monthly, whereas the current production environment has had zero outages since launch.