SPROCKET Automated SAS To Spark Migration

Is your business ready to ditch your arcane and proprietary analytics platform developed in the days of punch cards? Are those insanely expensive renewal costs driving you crazy? Does your organization need an analytics platform to provide better functionality, usability, performance, and scalability? Ready to make the switch to Apache Spark, but have lots of legacy SAS code critical to your business? You have arrived at the right place.

Apache Spark is the de facto standard analytics platform in the market. It has comparable, and often superior, functionality for most analytics and data science use cases. PySpark (the Python API for Spark) is simple, flexible, and easy to learn. Python is a high-level language that produces simple, easy-to-read code, and has been described as the most natural programming language for the way people think. Python itself provides a scripting language for building reusable processes (similar to SAS macros), while the PySpark API provides the data manipulation, ETL, and analytical tools comparable to DATA steps and PROCs. Some of the key benefits of moving from SAS to PySpark are:

  • Reduction or elimination of SAS renewal costs and licence restrictions
  • Vastly improved performance with pipelined in-memory transformations and analytics
  • Scalability, fault tolerance, and high availability
  • Developer availability – 10x more Python programmers than SAS programmers
  • Easy-to-use cloud platforms (Databricks, Azure HDInsight)
  • Access to the vast array of relevant Python libraries (web scraping, array/matrix processing, scientific computing, image processing, etc.)
  • Access to the latest machine learning and analytical algorithms
  • Ability to handle structured and unstructured data (no 32KB limits)
  • Full and complete documentation with a very active user community

Performing in-house SAS to Spark conversions is costly and time-consuming, with no guarantee of a positive outcome. Why develop skills in-house, when the ultimate goal of that skill is to make itself obsolete? That’s why WiseWithData is proud to offer the world’s first and only packaged solution for migrating SAS-based ETL and analytics jobs into the modern world of Apache Spark. Our decades of knowledge and expertise with SAS and PySpark, combined with over 4 years of focused development, have forged SPROCKET, the world’s most advanced code conversion engine. SPROCKET automates conversion of most SAS code patterns to PySpark. Using SPROCKET, we can quickly, consistently, and accurately convert SAS code to PySpark.

One of the key advantages of our solution is that the converted PySpark code maintains the original structure and workflow, with a nearly line-by-line conversion, simplifying testing and knowledge transfer. SPROCKET delivers exceptional benefits for our clients:

  • Unmatched code consistency and quality
  • PEP 8-compliant code style
  • Rapid delivery and testing
  • Accurate delivery timelines
  • Consistently accurate end-results
  • Integrated performance optimizations for improved run-time performance
  • Incorporation of SAS language concepts – lowering the learning curve for existing SAS users

For more information on this exciting offering, please contact us at inquiry@wisewithdata.com.