The Data Engineer position will focus on designing, developing, and supporting
our Hadoop data solutions in Spark and Python (PySpark) while working with
other components of the Hadoop ecosystem such as HDFS, Hive, Hue, Impala,
Zeppelin, and Jupyter. A successful candidate will work closely with business and
portfolio leads to understand requirements, then design and build innovative
Job Duties & Responsibilities.
Design and development centered around PySpark, Python and Hadoop
Work with gigabytes/terabytes of data and understand the
challenges of transforming and enriching such large datasets.
Provide effective solutions to address the business problems – strategic
Collaborate with team members, project managers, business analysts and
QA teams in conceptualizing, estimating and developing new solutions and
Work closely with stakeholders to define and refine the big data
platform to achieve sales, product, and strategic objectives.
Collaborate with other technology teams and architects to define and
develop cross-functional technology stack interactions.
Read, extract, transform, stage and load (ETL) data to multiple targets,
including Hadoop and Oracle.
Ingest and streamline incoming files of various layouts/formats as part of
Source Prep process.
Develop scripts around the Hadoop framework to automate processes and existing
Modify existing programming/code for new requirements.
Estimate work, and track progress through SDLC with JIRA/Confluence
Unit testing and debugging. Perform root cause analysis (RCA) for any
Convert business requirements into technical design specifications and
execute on them.
Participate in code reviews and keep applications/code base in sync with
version control (Git/Bitbucket).
Effective communication, self-motivation, and ability to work
independently while remaining fully aligned within a distributed team
Bachelor's or Master's degree in Computer Science (or Engineering
3+ years of experience with big data ingestion, transformation and
Analysis, design and implementation experience with Hadoop distributed
frameworks, including Python & Spark (SparkSQL, PySpark), HDFS, Hive,
Impala, Hue, Cloudera Hadoop, Zeppelin, Jupyter, etc.
Extensive experience handling large volumes of data (measured in
Terabytes/Billions of Transactions)
Proficient knowledge of SQL with any RDBMS
Familiarity with RDD and Data Frames within Spark
Working knowledge of data analytics
Troubleshooting and complex problem-solving skills
Knowledge of Oracle databases and PL/SQL
Working knowledge of Linux/Unix environments and comfort with Unix Shell
scripts (ksh, bash)
Basic Hadoop administration knowledge.
DevOps knowledge is an advantage.
Ability to work within deadlines and effectively prioritize and execute on
Strong communication skills (verbal and written) with ability to
communicate across teams, internal and external at all levels
Working knowledge of Oracle databases and PL/SQL.
Hadoop Admin & DevOps.
ETL skills (familiarity with Talend or other ETL tools a plus).
Good analytical thinking and problem-solving skills.
Ability to diagnose and troubleshoot problems quickly.
Motivated to learn new technologies, applications, and domains.
Possess appetite for learning through exploration and reverse engineering.
Strong time management skills.
Ability to take full ownership of tasks and projects.
Team player with excellent interpersonal skills.
Good verbal and written communication.
Possess a can-do attitude to overcome any kind of challenge.
Preferred Certifications (Any of these)
CCA Spark and Hadoop Developer.
MapR Certified Spark Developer (MCSD).
MapR Certified Hadoop Developer (MCHD).
HDP Certified Apache Spark Developer.
HDP Certified Developer.
Epsilon is a global advertising and marketing technology company positioned at
the center of Publicis Groupe. Epsilon accelerates clients' ability to harness
the power of their first-party data to activate campaigns across channels and
devices, with an unparalleled ability to prove outcomes. The company's
industry-leading technology connects advertisers with consumers to drive
performance while respecting and protecting consumer privacy. Epsilon's
people-based identity graph allows brands, agencies and publishers to reach
real people, not cookies or devices, across the open web. For more
information, visit epsilon.com.
When you're one of us, you get to run with the best. For decades, we've
been helping marketers from the world's top brands personalize experiences for
millions of people with our cutting-edge technology, solutions and services.
Epsilon's best-in-class identity gives brands a clear, privacy-safe view of
their customers, which they can use across our suite of digital media,
messaging and loyalty solutions. We process 400+ billion consumer actions each
day and hold many patents of proprietary technology, including real-time
modeling languages and consumer privacy advancements. Thanks to the work of
every employee, Epsilon has been consistently recognized as industry-leading
by Forrester, Adweek and the MRC. Positioned at the core of Publicis Groupe,
Epsilon is a global company with more than 8,000 employees around the world.
Check out a few of these resources to learn more about what makes Epsilon so
Our Culture: https://www.epsilon.com/us/about-us/our-culture-epsilon
Life at Epsilon: https://www.epsilon.com/us/about-us/epic-blog
DE&I: https://www.epsilon.com/us/about-us/diversity-equity-inclusion
CSR: https://www.epsilon.com/us/about-us/corporate-social-
Great People Deserve Great Benefits
We know that we have some of the brightest and most talented associates in the
world, and we believe in rewarding them accordingly. If you work here, expect
competitive pay, comprehensive health coverage, and endless opportunities to
advance your career.
Epsilon is an Equal Opportunity Employer. Epsilon's policy is not to
discriminate against any applicant or employee based on actual or perceived
race, age, sex or gender (including pregnancy), marital status, national
origin, ancestry, citizenship status, mental or physical disability, religion,
creed, color, sexual orientation, gender identity or expression (including
transgender status), veteran status, genetic information, or any other
characteristic protected by applicable federal, state or local law. Epsilon
also prohibits harassment of applicants and employees based on any of these
protected categories. Epsilon will provide accommodations to applicants
needing accommodations to complete the application process.