Experience

Samsung Research America
Currently working as Staff Machine Learning Engineer at Samsung Research America. (August 2018 to present)

tl;dr: Love to (and will) automate your tedious tasks away. Experienced with on-device AI, big-GPU AI, backend engineering, frontend dev, a little bit of design, and lots of software engineering.

Currently deep in on-device AI: running AI models on phones + making them fast = happiness. More specifically, I've been drilling into running LLMs 🦙 on mobile 📱. Extremely interested in GPU programming (NVIDIA / Qualcomm).

  • ✅ Writing raw OpenCL kernels for ops on phone.
  • ✅ Profiling CUDA models to identify bottlenecks at the op level.
  • ✅ Hacking llama.cpp to run our custom models.
  • ✅ Adding operators to ExecuTorch for mobile NPUs, running LLMs on NPU.
  • MLC-LLM for mobile GPUs.
  • Below are some highlights that gave me the biggest dopamine rush ⚡.

    🌍 Product side highlights

    • Developed the first version of the training pipeline for the Samsung Bixby© assistant's deep-learning-based intent classifier. Rewrote our research modeling and training code for a 50% speedup in training time.
    • Helped with on-device deployment of a part of Samsung Bixby©. Extracting performance with C++ code was some sweet, sweet satisfaction. Do we have a spare 10ms in our inference budget? Yes, sir, we do!

    ⚡ Optimization adventures ⛰️

    • With some trickery and a little additional memory (tradeoff), was able to rewrite a Python loop into pure PyTorch code and voila, a 100x speedup! (Blog post coming soon!)
    • When running experiments for the Samsung Bixby© training model, set up the entire training pipeline in Jenkins so that each experiment has its own artifacts and can be deployed with one click. Being able to go through all previous experiments, and even pick a run and benchmark its model, was a productivity boost like no other.
    • Automated some data filling with less than 100% accuracy (something > nothing) in a process that was otherwise purely manual. The mental burden and time taken for the task went down from 1 hour to <5 minutes on average, mostly because editing some wrong info is way easier than filling it all in manually. No dopamine rush beats people thanking you personally for making their lives easier.

    Early on in my career at SRA, my day-to-day included tackling cool problems of system design, implementing and scaling machine learning models (still do), developing scalable back end services and a front end that is a pleasure to use and a "lifesaver" (Slack messages to back this up).

    Was also a research intern here from August 2017 to June 2018. As an intern, gained lots of experience implementing machine learning solutions end to end. Developed a distributed system framework that allows switching Natural Language Processing engines in real time. Also had papers published at reputable NLP conferences, ACL 2018 and COLING 2018.

Florida State University, Computer Science Department
Worked as Teaching Assistant at Florida State University, Computer Science Department. (January 2017 to May 2017)
Worked as a TA for Data Communications, a graduate-level course in the Computer Science department at FSU.
Florida State University, Computer Science Department
Worked as Teaching Assistant at Florida State University, Computer Science Department. (September 2016 to December 2016)
Served as a TA for Concurrent, Parallel and Distributed Programming, a graduate-level course in the Computer Science department at FSU.
Florida State University, Human Resources
Worked as Web Application Developer at Florida State University, Human Resources. (September 2015 to August 2016)
Developed web applications for the HR department. Also helped maintain the legacy code running the current website. Worked on front end, back end, and server management for the web applications.
Inkrasa
Worked as an Intern at Inkrasa. (January 2015 to May 2015)
Developed an e-commerce website that sells paintings. Developed the back end as well as the front end for the website. Also integrated support for payment using the PayU gateway.

Publications

A New Concept of Deep Reinforcement Learning based Augmented General Tagging System - Y. Wang, A. Patel, Y. Shen, H. Jin
Published
Accepted at COLING 2018
In this paper, a new deep reinforcement learning (DRL) based coaching model (DCM) for slot filling in SLU tasks is proposed. Worked on the core algorithm development, and did the whole implementation, including putting all the models behind an API and building the Android chat app to interact with the QA agent.
CRUISE: Cold-Start New Skill Development via Iterative Utterance Generation - Y. Shen, A. Ray, A. Patel, H. Jin
Published
Accepted at ACL 2018
In this paper, we present a system, CRUISE, that guides developers to build a high-quality natural language understanding (NLU) engine from scratch without having to manually generate and annotate a large number of utterances. Instead, we design a hybrid rule-based and data-driven approach with the capability to iteratively generate more and more utterances. Developed a common framework for NLU engines, as well as the back end and the front end of the system.

Achievements

Toastmasters Secretary at Samsung Speaks Toastmasters club. (2019 - 2020)
Won Second Prize at the "ACM Fall 2015 Programming Contest".
Developed and published a VR (Virtual Reality) game on the Google Play Store. Look for Shiny Bikes VR on the Play Store.
Completed "Linux From Scratch".
Reputed Stack Overflow profile.

Projects

tokenizers-cpp: Huggingface tokenizers for C++
Contributed to an open-source C++ wrapper for Hugging Face tokenizers. Also have instructions for cross-compiling them for Android.
rusty-celery
Contributor to the Rust library rusty-celery. Helped develop support for the Redis broker.
oauthlib
Contributing to the awesome Python oauthlib library. This is the library that other Python OAuth frameworks and libraries build upon. Great folks there, give them a ⭐️ for their hard work! Love reading RFCs? Join us!
Voice Control for PC
Developing a generic framework that would allow developers to incorporate voice control into their apps on PC. Also includes a Firefox extension to control the browser. The actual voice engine is a private repo as of now.
Neural Net controlled Ants
Developed neural networks that learn, in an unsupervised manner, to find a path towards food, using genetic algorithms for the learning. No external libraries were used. Check out the live demo here.
Distributed training for RNN
This project ports an existing implementation of a character RNN to a variant that can be run in a distributed environment (multiple machines, multiple GPUs). The aim was to demonstrate how to port existing TensorFlow implementations to a distributed mode.
Rellit
Browser extension to browse Reddit with a nice coat of material design. Live on Mozilla Addons Store. Coming soon on Chrome Store (Under review).
RetroScII
Converts images into ASCII art. Builds a model from a font file, then uses that model to create ASCII art for any input image.
Tron for VR
Developed a VR version of the old classic Tron game. Built it using Unity and published the Android app on the Google Play Store. Check it out on the Play Store.
Flaticon scraper
Scrapes icons from Flaticon.com based on a list of search queries.
Hungrify!
Developed an Android app that suggests a restaurant based on your location. The entire back end was developed in Laravel. Built as part of a hackathon (HackFSU'16).
P2P File Sharing
Developed a Peer to Peer File Sharing Program in C++. Created an algorithm for searching any file on the network without any prior knowledge of its existence or location, using keywords. Uses a combination of processes and threads.
MVC framework for PHP
Developed a basic MVC framework from scratch for PHP based web applications.
Comments Package for PHP
Published a Laravel package on Packagist, called “asp/commenter”, that lets you add comments functionality to any page with just 6 lines of code: 3 in the back end and 3 for front-end display. The package takes care of creating and managing topic threads, comments and replies. Get it from here.
lolpython
Prints rainbow text on the terminal. An enhanced Python port of the classic lolcat. Also available as a Python package from PyPI.

Skills

llama.cpp, CUDA

Education

MS (C.S.)
Florida State University (Graduated December 2017).
GPA: 3.8 / 4.0
B.Tech (I.T.)
Nirma University, Gujarat, India (Graduated May 2015).
GPA: 7.8/10
High School
Don Bosco High School, Vadodara, India (Graduated March 2011).
GPA: 86/100