How Katz, Sapper & Miller Used Databricks Lakehouse Accelerator on AWS to Get AI Ready

Learn how CleanSlate accelerated KSM’s use of the Databricks Lakehouse Platform on AWS to achieve a modern, scalable, and secure solution in 10 weeks

EXECUTIVE SUMMARY

Katz, Sapper & Miller (KSM), a top U.S. accounting and advisory firm, sought to modernize its data infrastructure to better support growth, acquisitions, and AI readiness. Legacy systems created data silos, inconsistent reporting, and limited real-time insights, slowing innovation and decision-making.

Partnering with CleanSlate Technology Group, KSM adopted the Data Lakehouse Accelerator on AWS, replacing legacy SSIS and Power BI systems in just 10 weeks. This transformation established a centralized, scalable, and governed data foundation designed for analytics, automation, and AI experimentation.

CleanSlate built reusable Infrastructure-as-Code (IaC) patterns, automated ingestion and transformation pipelines, and trained KSM’s internal team for long-term management. The result is a secure, unified platform that accelerates innovation, enhances governance, and positions KSM for continued growth through advanced data intelligence.

Katz, Sapper & Miller (KSM) is a leading Indianapolis-based advisory, tax, and audit firm that offers a broad range of professional services to national and international organizations. The company was founded in 1942 and has since grown to be one of the 50 largest independent CPA firms in the United States. As the largest firm in Indianapolis, KSM currently employs more than 700 people and was recently ranked #7 among the 100 Fastest-Growing U.S. Accounting Firms.

KSM was using a legacy data lakehouse that could not keep up with the demands of expansion, particularly new acquisitions. The acquisitions exposed the old process as an obstacle to growth and a source of increasing complexity, forcing KSM to rethink its approach.

KSM was also struggling to consolidate and use its data effectively to enable AI. With data silos and an outdated legacy architecture, KSM had limited access to real-time insights, which slowed decision-making and innovation. The company needed a foundation with patterns it could build on and scale, one that could handle a variety of sources and serve as a single source of truth for analytics. It also needed a modern lakehouse to deliver an optimal customer experience with configurable, data-driven pipelines. This would allow KSM to begin leveraging and monetizing its valuable data as an asset in the cloud.

KSM chose CleanSlate Technology Group, a leading provider of professional cloud services for Amazon Web Services (AWS), to help them in their journey to modernization. CleanSlate deployed their Data Lakehouse Accelerator to replace KSM’s legacy use of Microsoft SSIS/Power BI with Databricks functionality in just 10 weeks. This modern infrastructure included a production-ready framework with integrated data ingestion, transformation pipelines, and governance strategies—ultimately a unified approach for data and AI.

Challenges

Siloed Legacy Systems, Limited AI Readiness

CleanSlate faced several challenges when determining the best strategy for KSM. These included:

 

  • Siloed data sources and inconsistent reporting.

KSM’s information was stored in separate departments and systems and was not accessible to the rest of the organization. This made it difficult for KSM to share information, collaborate effectively, and make informed decisions. Because different teams worked from incomplete data, reports were often inaccurate and confusing.

 

  • Lack of a formalized, centralized data foundation.

KSM’s underlying infrastructure, strategies, and processes for managing data were not formalized or centralized, which led to inconsistent data and inefficient operations. From how data was collected and stored to how its quality and accessibility were ensured, there was considerable room for improvement by moving to an AWS infrastructure.

There was also no standard policy or clear governance for how data was handled, which led to unreliable information.

 

  • Desire to move toward AI/ML-enabled insights.

KSM wanted to shift its business strategy to rely on data-driven recommendations generated by artificial intelligence (AI) and machine learning (ML). Instead of relying on traditional analytics, spreadsheets, or human intuition, they wanted to take advantage of advanced technology to uncover patterns, predict outcomes, and automate decision-making from large sets of data.

By doing this, KSM can become more proactive and competitive by turning complex, raw data into clear, actionable intelligence. CleanSlate needed a clear plan to help them accomplish this.

 

  • Complex data transformation and ingestion needs.

KSM needed a system to collect, clean, and convert data from multiple, diverse sources into a usable format for analysis. The volume, variety, and velocity of the data made this a challenge: the system had to integrate different data formats, ensure data quality, and process potentially large volumes of information efficiently.

Solutions

Databricks Lakehouse Accelerator

CleanSlate delivered a complete system for generating time entry reports using two key cloud services: Amazon Web Services (AWS) and Databricks. This solution not only established the repeatable data processing patterns required to begin migrating time entry reporting and future data to the new Lakehouse, but also enabled new analytic and AI/ML capabilities. CleanSlate handled the entire process, from collecting data to creating the final reports. Here is an overview of the solutions CleanSlate provided:

 

  • Replaced legacy use of Microsoft SSIS/Power BI with Databricks functionality.

By migrating to Databricks, KSM gained a single, collaborative workspace for all of its data teams: engineers, analysts, and scientists. The firm no longer needed separate tools such as SSIS for data movement and Power BI for reporting, which streamlined the entire workflow. Through Databricks’ AI/BI Genie, KSM could generate booking data reports automatically. This bypassed the traditional, time-consuming method of hiring or relying on Power BI developers to create custom reports manually.

KSM’s legacy process needed to continue to exist as they moved forward with modernization. CleanSlate provided a hybrid approach that allowed them to do just that, bringing on new sources and analytics while maintaining the existing BI solution.

 

  • Developed scalable data pipelines for ingestion and transformation.

Using AWS and Databricks, CleanSlate built an automated system that moves and processes large amounts of data efficiently and handles increased data volume without performance issues. The process covers ingestion (collecting raw data) and transformation (cleaning and preparing it for analysis), both designed to scale as needed.
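To make the two stages concrete, here is a minimal sketch in plain Python rather than Databricks code. It assumes a hypothetical time entry export; the column names, sample rows, and the `ingest` and `transform` helpers are illustrative and are not taken from KSM's actual pipelines. Raw rows are loaded untouched (the Bronze layer), then typed, validated, and de-duplicated (the Silver layer).

```python
import csv
import io
from datetime import date

# Hypothetical raw export; column names and values are made up for illustration.
RAW_CSV = """employee_id,work_date,hours
E001,2024-01-15,7.5
E002,2024-01-15,
E001,2024-01-15,7.5
E003,not-a-date,8
"""

def ingest(raw_text):
    """Ingestion (Bronze): load raw rows as-is, keeping every value as a string."""
    return list(csv.DictReader(io.StringIO(raw_text)))

def transform(bronze_rows):
    """Transformation (Silver): type, validate, and de-duplicate the Bronze rows."""
    silver, rejected, seen = [], [], set()
    for row in bronze_rows:
        try:
            record = {
                "employee_id": row["employee_id"].strip(),
                "work_date": date.fromisoformat(row["work_date"]),
                "hours": float(row["hours"]),
            }
        except ValueError:
            # Keep unparseable rows for review instead of dropping them silently.
            rejected.append(row)
            continue
        # Assumption for this sketch: one entry per employee per day.
        key = (record["employee_id"], record["work_date"])
        if key in seen:
            continue
        seen.add(key)
        silver.append(record)
    return silver, rejected

bronze = ingest(RAW_CSV)
silver, rejected = transform(bronze)
print(len(bronze), len(silver), len(rejected))
```

Keeping the raw rows separate from the cleaned ones means a transformation rule can be changed and re-run later without collecting the source data again.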

 

  • Trained internal team to manage Lakehouse.

CleanSlate wanted the KSM team to be able to manage the Lakehouse on its own long after the project was complete. To make this possible, CleanSlate invested time in training the internal team to manage the Lakehouse so they would be the experts on their own data. The training was customized to focus on KSM’s unique priorities and use cases for the Lakehouse.

 

  • Created reusable patterns using Infrastructure-as-Code.

CleanSlate used Infrastructure-as-Code (IaC) principles to create reusable, standardized templates for common infrastructure setups. By defining infrastructure in code, KSM can reuse these patterns across different projects or environments, which saves time, reduces errors, and ensures consistency. CleanSlate delivered a data lakehouse platform with two functional use cases. This included object hierarchy design, role-based access, ingestion patterns, data quality enforcement, a dimensional model for consumption, and connectivity with Power BI.
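The idea of a reusable pattern can be sketched in a few lines of Python. The case study does not say which IaC tool was used, so this example does not target any real tool's schema; the `lakehouse_template` function, its keys, and the bucket naming scheme are all hypothetical. It only shows how one parameterized definition can produce a consistent layout for every environment.

```python
import json

def lakehouse_template(environment, layers=("bronze", "silver")):
    """Return a declarative description of one environment's storage layout.

    Hypothetical structure: the keys and naming scheme are illustrative only
    and do not correspond to any particular IaC tool's schema.
    """
    return {
        "environment": environment,
        "buckets": [
            {"name": f"lakehouse-{environment}-{layer}", "layer": layer, "encrypted": True}
            for layer in layers
        ],
    }

# The same pattern is reused for every environment; only the parameter changes.
environments = {env: lakehouse_template(env) for env in ("dev", "test", "prod")}
print(json.dumps(environments["dev"], indent=2))
```

Because every environment comes from the same function, a change to the pattern (for example, adding a layer) applies everywhere at once instead of being repeated by hand.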

 

Results

AI-driven business growth

CleanSlate delivered the solution KSM was looking for: a modernized data lakehouse that combines AWS and Databricks Lakehouse to improve scalability and security. This innovative platform positions KSM for success both now and in the future. Here are a few results that were delivered:

 

  • Centralized and scalable data foundation

    • Pre-Built Accelerator for rapid deployment
    • Automation built in
    • Batch and streaming use cases supported
    • Automated stand-up of the Lakehouse
    • Bronze & Silver Medallion layers in AWS or Azure
    • Seamless data integration pipelines
  • Ready for experimentation in AI/ML.

With this new infrastructure, data pipeline, and workflow in place, KSM can rapidly test, train, and manage different artificial intelligence and machine learning models. CleanSlate equipped KSM with the tools they needed to explore new ideas and move promising models into production quickly and efficiently.

KSM wanted a generative AI foundation for natural language analysis that uses its own business terminology. CleanSlate helped put that foundation in place.

  • Increased reporting efficiency and governance.

    • Full monitoring and observability
    • Dashboards for cost, job execution, and data quality
    • Databricks Unity Catalog with role-based access
    • Data Quality with ML Models
  • Established organizational structures and roles, enabling governed growth and observability.

With the new foundation that CleanSlate helped implement, KSM can now expand in a controlled, deliberate, and secure way through clear processes, decision-making rules, and risk management strategies.

  • Project completed on time in 10 weeks.

Because the project was completed quickly and on time, KSM can start realizing improved performance and cost savings sooner, with a single platform that lowers storage and compute costs.
