How Is Kafka So Performant If It Writes to Disk?
How traditional data transfer works, what the Zero Copy optimization is, and how Kafka benefits from it when combined with the Page Cache.
How to rescale a running Flink job? Rescaling is useful to better use computational resources when your application does not have the same workload at all ti...
Covering Spark’s default and Databricks’ Transactional Write strategies used to write the result of a job to a destination and guarantee no partial results a...
Comparing standard and memory-optimized Excel generation using the Apache POI library
In this post I’m sharing my feedback and some preparation tips on the CRT020 - Databricks Certified Associate Developer for Apache Spark 2.4 certification ex...
If you are using version 4.5.2 of the HttpClient library to make HTTP requests to a backend server with SSL and SPNego, and the requests are unexpectedly fai...
A step-by-step guide on how to implement custom Spark Evaluators in StreamSets
Have you ever wondered how Spark uses Kerberos authentication? How and when are the spark-submit --principal and --keytab options used? Th...
This post explains in detail, and presents a workaround for, a problem with HBase authentication in long-running Spark 2 applications.
A step-by-step guide on how to process change data capture (CDC) events with StreamSets, using its Oracle CDC Client and delivering to CRUD and non-CRUD dest...
In this post you will see how Kerberos authentication with pure Java Authentication and Authorization Service (JAAS) works and how to use the UserGroupInform...
Are you interested in taking the CA175 certification? Here is my feedback on the exam structure, the exam environment, and practical exercises you can do to prepare...