Profile:
* Develop Big Data applications in the financial sector using the Cloudera framework (CDH 5.10), Spark (SparkSQL, MLlib), SparkR, Syslog-NG, Flume, Kafka, HDFS, MapReduce, YARN, HBase, Hive, ZooKeeper, shell scripting, Linux, Python, Scala, Ansible, and Agile methodology.
* Linux shell scripts for automating deployment and configuration. Data collection and ingestion from production servers into HDFS and HBase (serving as the data lake) using Syslog-NG, Flume, and Kafka. Used Cloudera Manager to manage and monitor cluster health and performance.
* Leverage SparkSQL and MLlib in Python for data analytics and machine learning on live streaming data.
* Defined HBase column families and columns, created a Java API for applications to access data in HBase, and designed and automated reporting modules.
* Work with petabytes of data, using data compression for efficient storage and I/O and a Kerberos/TLS implementation for the security layer.
* Experienced in data analytics and in processing large-scale structured and unstructured data using Spark. Familiar with both IBM's and Cloudera's Data Science Workbench.
* Over 18 years of experience in software development, database design, development, management, and data analysis. Experienced in the full development life cycle for enterprise-level applications.
* Extensive knowledge of UNIX, HP-UX, and Linux as a system administrator, and of the Oracle and Postgres relational databases and the Cassandra NoSQL database as a database administrator.
* Proficient in SQL, PL/SQL, and performance tuning. Expert in Ksh, Perl, Python, and Java.
* Environment: CDH, HBase, Hive, Impala, ZooKeeper, Linux (Shell/Perl), HDFS, Cassandra, Oracle 11g/12c
* Experienced in Erwin modeling, ER/Studio Data Architect, and other database design tools.
* Excellent communication skills and a good team player, experienced in leading tasks.

