Sanjay Das
sanjay@sanjaydas.com | 732-443-0838 | Long Branch, NJ, USA | US Citizen
Blog | LinkedIn | GitHub | Stack Overflow
Dedicated, results-driven Engineer/Leader with over a decade of experience architecting and implementing robust data platforms and solutions. Proven track record of developing and deploying innovative GenAI projects (listed at the bottom). Adept at leveraging cutting-edge technologies for scalability, performance, and reliability.
Education: M.Sc. (Tech) Computer Science, First Class
Birla Institute of Technology and Science (BITS), Pilani, India
Work Experience:
Staff Engineer/Director - Data Engineering, AdMarketplace, New York, NY, October 2020 - Present
Lead the Data Platform team, overseeing the implementation of data platform architecture.
Improve and optimize data pipelines for scalability, performance, and reliability.
Design, build, and maintain core tools and Data products.
Lead the migration from a legacy Data Warehouse to a modern Data Lake, Warehouse, and Data Mart.
Develop proof-of-concept/reference implementations for components and pipelines.
Work closely with data consumers to design and implement a process for creating a unified data model, giving all users access to a single source of truth.
Mentor engineering teams and ensure the quality of delivery.
Consultant, Cigna/CVS-Health, New York, NY (Remote), October 2019 - September 2020
Developed, deployed, and monitored a Big Data platform for internal/external access to Med/Rx Claims.
Handled high-volume data transactions and resolved throughput/latency issues.
Provided mentorship to junior engineers and conducted code reviews.
Principal Software Engineer, ETrade, New Jersey/New York, April 2017 - October 2019
Engineered a Big Data platform for internal/external access to various data sets.
Managed near-live market data and ingested event streams from external vendors.
Developed Java Kafka-KStream modules and provided mentorship to junior engineers.
Consultant/Architect/Software Engineer, New Jersey/New York, June 2014 - January 2017
Designed and developed solutions for the ingestion and processing of massive batch data for Citibank.
Managed and maintained consumer-facing Android/iOS apps and responsive web front-end for esPronto.
Cofounder/CTO, New Jersey/India, April 2009 - January 2014
Architected and developed an online stock trading and cloud-based algo trading system that can spawn an Amazon EC2 container with private access to a Market Data Cloud, a Virtual Exchange, and a suite of algorithms.
Additional Experience:
Consultant: Citibank, Archeus Capital (Hedge Fund), New York, NY; Goldman Sachs, Princeton, NJ
Developer: Morgan Stanley, New York, NY; Wit SoundView (Start-up Broker), New York, NY
Skills:
Programming Languages: Java, Scala, Python, C++
Big Data Technologies: Spark, Kafka, Hadoop, ClickHouse, Databricks, Airflow
Cloud and Infrastructure: AWS, Kubernetes, Microservices
Databases: Oracle, Cassandra, MySQL, Elasticsearch, Vector Databases
Other: AdTech, FinTech, HealthTech
LLM/GenAI Projects:
Architected a consumer-facing FinTech app:
Created a pipeline to curate a custom dataset (Beautiful Soup, spaCy, Solr, Airflow)
Fine-tuned a Llama 3 model using PEFT/QLoRA (Hugging Face)
Used agents for email and stock portfolio analysis (LangGraph)
Deployed, monitored, and managed the model (SageMaker)
Worked on the following components for a consumer-facing shopping app:
Evaluation of vector embedding models and algorithms for indexing embeddings.
Data ingestion pipelines to feed an Apache Lucene-based vector database.
Microservice used to create embeddings.
RAG pipelines to query the vector database using k-NN search.