Posts by Year

2021

2020

Rescaling in Flink

How do you rescale a running Flink job? Rescaling helps you make better use of computational resources when your application does not have the same workload at all ti...

Transactional Writes in Spark

Covering Spark’s default and Databricks’ Transactional Write strategies used to write the result of a job to a destination and guarantee no partial results a...

2019

CRT020 Certification Feedback & Tips!

In this post I’m sharing my feedback and some preparation tips on the CRT020 - Databricks Certified Associate Developer for Apache Spark 2.4 certification ex...

How Spark Uses Kerberos Authentication

Have you ever wondered how Spark uses Kerberos authentication? How and when the principal and keytab provided through the spark-submit --principal and --keytab options are used? Th...

Processing Oracle CDC with StreamSets

A step-by-step guide on how to process change data capture (CDC) events with StreamSets, using its Oracle CDC Client and delivering to CRUD and non-CRUD dest...

Authentication using Kerberos

In this post you will see how Kerberos authentication with pure Java Authentication and Authorization Service (JAAS) works and how to use the UserGroupInform...

CCA175 Certification Feedback & Tips!

Are you interested in taking the CCA175 certification? Here is my feedback on exam structure, exam environment and practical exercises you can do to prepare...
