This announcement sounds almost too good to be true, so some skepticism is warranted. The world’s most scalable columnar NoSQL database might have just gotten a lot faster. Apache Cassandra, already a heavyweight in the world of Big Data, has been reimplemented in C++ as ScyllaDB. It uses a new lockless, shared-nothing architecture, which its developers say delivers much higher performance and lower latency. It’s API-compatible with Cassandra, so existing code and tools should work without modification. Only time will tell whether these claims hold up, but it may be a dream come true for many data engineers and data scientists. For more information, check out ScyllaDB’s website at http://www.scylladb.com/.
The Apache Spark 1.5.0 release brings large improvements in performance and functionality, including:
- Significant expansion of DataFrame functionality with over 100 new functions
- Integration of Project Tungsten, which massively improves performance and makes response times more consistent by eliminating or reducing JVM GC pauses
- Improved Python support, bringing API compatibility much closer to Scala and Java
For more information, check out https://spark.apache.org/news/spark-1-5-0-released.html.