DESCRIPTION
Our customer exposes an API that lets other businesses access risk data associated with a person, e.g. understand their credit score.
Behind the scenes there is a sophisticated decisioning system and a large volume of data. The back end of this API currently has a number of legacy versions, serving hundreds of clients, with individual installations for most of them.
We need to create a data lake for one of the biggest data analytics companies working with personal information both domestically and internationally.
In a nutshell, this involves replatforming an on-premise Enterprise Data Hub from a Hadoop cluster to GCP. Day-to-day tasks include, but are not limited to, creating Spark applications that manipulate data from different sources, including Oracle, Google Cloud Storage, and BigQuery;
creating pipelines with GCP Dataflow; and working with Jenkins and Airflow.
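To give a flavour of the day-to-day work, here is a minimal sketch of the kind of Spark job described above: it reads an extract from Google Cloud Storage, joins it with reference data pulled from Oracle over JDBC, and writes the result to BigQuery. All bucket names, table names, and connection details are hypothetical placeholders, not the customer's actual pipeline.

```scala
import org.apache.spark.sql.SparkSession

object RiskDataLakeJob {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("risk-data-lake-sketch")
      .getOrCreate()

    // Source 1: daily extract landed in a GCS bucket (hypothetical path).
    val events = spark.read.parquet("gs://example-bucket/risk-events/2024-01-01/")

    // Source 2: reference data read from Oracle via JDBC (hypothetical connection details).
    val persons = spark.read
      .format("jdbc")
      .option("url", "jdbc:oracle:thin:@//oracle-host:1521/ORCLPDB1")
      .option("dbtable", "RISK.PERSONS")
      .option("user", sys.env("ORACLE_USER"))
      .option("password", sys.env("ORACLE_PASSWORD"))
      .load()

    // Enrich events with person attributes and write to BigQuery using the
    // spark-bigquery connector (available on Dataproc); dataset and bucket names are made up.
    events.join(persons, Seq("person_id"))
      .write
      .format("bigquery")
      .option("table", "example_dataset.enriched_risk_events")
      .option("temporaryGcsBucket", "example-tmp-bucket")
      .mode("overwrite")
      .save()

    spark.stop()
  }
}
```

In practice such jobs would typically run on Dataproc and be orchestrated by Airflow DAGs, with Jenkins handling build and deployment.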
Requirements
5+ years of experience as a Java Developer
Proficiency in data engineering
Expertise in Big Data: Hadoop, Spark, Kafka
Strong knowledge of Scala
Expertise in Microservices Architecture
Ability to work with high volumes of data
Experience working with AWS
Experience working with GCP: Dataproc (Apache Spark), Dataflow (Apache Beam), BigQuery, Cloud Storage
Good understanding of Design Patterns, Clean Code, Unit testing
Experience working in Agile environment
Data modelling skills would be a plus
Experience with Jenkins and Airflow, using Groovy and Python
Excellent written and verbal communication skills
Intermediate or higher English level, both verbal and written (B1+)
We offer
Competitive compensation depending on experience and skills
Individual career path
Social package - medical insurance, sports
Unlimited access to LinkedIn learning solutions
Sick leave and regular vacation
Partial coverage of costs for certification and IT conferences
English classes with certified English teachers
Flexible work hours