Jit

About me!

My name is Satyajit Bhowmick (fondly known as 'Jit'). I'm a computer science graduate, an avid programmer and a tech buff. My research interests lie in data science and machine learning. I believe in 'learning things by doing', and in doing so I strive for excellence in my areas of interest. Most importantly, I try to improve myself every day to be a better version of me.
Find more details about me on this page. Keep visiting, as I update it very often.

Academics

Master of Science (MS) in Computer Science

University of Cincinnati, USA
Thesis: A Fog-based Cloud Paradigm for Time Sensitive Applications
Specialization: Data Science, Cloud Computing, Big Data, Machine Learning.

Bachelor of Technology (B. Tech.) in Electronics & Communication Engineering

West Bengal University of Technology, India
Final Project: Estimation of Hand Force from Surface Electromyography Signals using Artificial Neural Network

Skills

Publications

Suryadip Chakraborty, Satyajit Bhowmick, Paul Talaga, and Dharma P. Agrawal, “Fog Networks in Healthcare Application”, 13th IEEE International Conference on Mobile Ad hoc and Sensor Systems, Brasilia, Brazil, Oct 10-13, 2016.
Satyajit Bhowmick, Suryadip Chakraborty, and Dharma P. Agrawal, “Study of Hadoop-MapReduce on Google N-Gram Datasets,” 12th IEEE International Conference on Mobile Ad Hoc and Sensor Systems (MASS), Dallas, USA, Oct 19-22, 2015, pp. 488 – 490.
Soumyajit Ganguly, Satyajit Bhowmick, Arnab Pal, Sauvik Das Gupta, “A Novel Approach to Design a Customized Image Editor and Real-Time Control of Hand-Gesture Mimicking Robotic Movements on an I-Robot Create”, IOSR Journal of Computer Engineering (IOSR-JCE), e-ISSN: 2278-0661, p-ISSN: 2278-8727, Volume 16, Issue 3 (May-Jun. 2014), PP 56-63.
Satyajit Bhowmick, Rajesh Bag, SK Masud Hossain, Subhajit Ghosh, Shanta Mazumder, Sauvik Das Gupta, “Modelling and Control of a Robotic Arm Using Artificial Neural Network”, IOSR Journal of Computer Engineering (IOSR-JCE), e-ISSN: 2278-0661, p-ISSN: 2278-8727, Volume 15, Issue 2 (Nov. - Dec. 2013), PP 42-49.
Rajesh Bag, Satyajit Bhowmick, Rahul Ghosh, Abhishek Kumar Gond, “BLUETOOTH BASED AUTOMATIC HOTEL SERVICE SYSTEM USING PYTHON”, IOSR Journal of Computer Engineering (IOSR-JCE), e-ISSN: 2278-0661, p-ISSN: 2278-8727, Volume 15, Issue 2 (Nov. - Dec. 2013), PP 76-80.
Indranil Das, Jayeeta Bhattacharya, Pallabi Saha, Satyajit Bhowmick, Subhajit Das, Sauvik Das Gupta, “EXPLORING THE RELATIONSHIP BETWEEN ELECTROMYOGRAPHY SIGNAL AND FLEXION FORCE USING ARTIFICIAL NEURAL NETWORK”, National Conference On Applied Electronics, Technically Sponsored By IEEE and CSIR, WB, India, Oct 25-26, 2013, pp. 77-82.
Abhishek Mallik, Diptyajit Das, Soumyajit Routh, Satyajit Bhowmick, Anik Karan, Sauvik Das Gupta, “MEMS based Viscometric Biosensor to continuously detect Glucose level of a diabetic patient”, International Journal of Scientific & Engineering Research, ISSN 2229-5518, Volume 4, Issue 12, December-2013, PP 1710-1713.

Projects

Text Analysis Web Service Container in Docker

A web app, packaged in a Docker container, for analyzing text. The service accepts input via a GET parameter and returns the result as a JSON string.
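
As a rough illustration only (not the project's actual code), a service of this shape can be sketched with Python's standard library. The `text` parameter name, the statistics returned, and the names `analyze_text`, `AnalysisHandler` and `run` are all assumptions made for this sketch.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse


def analyze_text(text: str) -> dict:
    """Compute a few simple statistics (a hypothetical choice of metrics)."""
    words = text.split()
    return {
        "characters": len(text),
        "words": len(words),
        "unique_words": len({w.lower() for w in words}),
    }


class AnalysisHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Read the input from the "text" query parameter (assumed name).
        query = parse_qs(urlparse(self.path).query)
        text = query.get("text", [""])[0]
        # Serialize the analysis result as a JSON string.
        body = json.dumps(analyze_text(text)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def run(port: int = 8000) -> None:
    """Serve forever; in a container this would be the entry point."""
    HTTPServer(("0.0.0.0", port), AnalysisHandler).serve_forever()
```

Calling `run()` starts the server, after which a request such as `/?text=Hello%20hello%20world` returns the statistics as JSON.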

NASA Server Log Analysis using Apache Spark

Used Spark to explore and analyze HTTP request data from the NASA Kennedy Space Center web server. Found facts such as the number of error paths, the number of unique hosts, the average number of daily requests per host, the 404 status code records, the number of 404 errors per day, etc.
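
The kind of counting involved can be sketched in plain Python. This is a simplified stand-in for the Spark job, not the project's code: the regular expression assumes Common Log Format lines, the sample lines are made up for illustration, and `parse_line` and `summarize` are hypothetical names.

```python
import re
from collections import Counter

# Assumed layout: Common Log Format, e.g.
# host - - [01/Aug/1995:00:00:01 -0400] "GET /path HTTP/1.0" 200 1839
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<day>\d{2}/\w{3}/\d{4}):[^\]]+\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\S+)$'
)


def parse_line(line):
    """Return the named fields of one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line.strip())
    return match.groupdict() if match else None


def summarize(lines):
    """Count unique hosts, total 404 responses, and 404 responses per day."""
    records = [r for r in map(parse_line, lines) if r is not None]
    not_found = [r for r in records if r["status"] == "404"]
    return {
        "unique_hosts": len({r["host"] for r in records}),
        "not_found_total": len(not_found),
        "not_found_per_day": dict(Counter(r["day"] for r in not_found)),
    }


# Made-up sample lines, for illustration only.
SAMPLE = [
    'host-a.example - - [01/Aug/1995:00:00:01 -0400] "GET /index.html HTTP/1.0" 200 1839',
    'host-b.example - - [01/Aug/1995:00:00:09 -0400] "GET /missing.html HTTP/1.0" 404 -',
    'host-a.example - - [02/Aug/1995:10:15:00 -0400] "GET /old/page.html HTTP/1.0" 404 -',
]

summary = summarize(SAMPLE)
```

In Spark, the same parse, filter and count steps would typically be expressed as transformations over a distributed dataset rather than Python lists.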

Extreme Programming (XP) – The Brain of a Vending Machine

Developed the algorithm and programming model for the brain of an intelligent vending machine with a focus on Test Driven Development.

Remote File Sharing Portal

Implemented a multi-user, command-line file sharing system using Remote Procedure Calls. Multiple users can share files with each other, and file transfers can be performed directly peer to peer or with the server as an intermediary.

Password Cracking with Map Reduce

An application that can breach a large database and crack passwords. Using MapReduce in Python greatly improved performance and reduced the run time of this compute- and data-intensive task.

Twitter Trend Analysis with Hadoop

Analyzed more than 1 TB of Twitter data in the MapReduce framework using Hadoop streaming mode. Found trends and useful information such as the average number of tweets, average tweet length, tweet length ratio for a user, etc., from the data.

Rendering Geo Tweets

Consumed geo tweets from the Twitter stream. When a user provides coordinates, nearby tweets are retrieved and rendered on a map.

Big Data Analysis using Apache Pig

Analyzed the Twitter data of a user named @PrezOno. The relevant information was extracted and analyzed very quickly from more than 1 TB of data using Apache Pig. I found the hour of the day at which @PrezOno tweets the most on average.

Dynamic Web App using Google App Engine Web Services

Created a user-facing dynamic web app on top of Google App Engine services. App Engine APIs were used to make the app user friendly and multi-functional. The implementation uses memcache, the secure login service and the email service.

Price/Performance Analysis of Amazon EC2 for Serving Dynamic Web Contents

Analyzed on-demand EC2 instances serving dynamic web pages to find the lowest-cost option. The web server performs a computationally intensive task. Apache JMeter was used to automate the web server performance tests. The web script was benchmarked on different instance flavors and the differences in response time were analyzed.

Github

Professional Experience

Software Engineer, Caterpillar Inc, Aug ’17 - Present

* Maintain and improve the SuperComm2 application system and webpage for the high-volume big data repository of the SuperComm2 Command Center.
* Write robust applications to convert high-frequency raw machine data (.SC2) to human-readable formats like GH5, THD, TDMS, and XML.
* Created a data pipeline to transfer real-time, high-frequency data to AWS S3 buckets while maintaining the sanity of the data.
* Write multithreaded Python scripts that run across multiple servers.
* Develop and maintain a proprietary Python data analysis framework called PyDAX.
* Analyze various cross-functional, multi-platform application systems, enforce Python best practices, and provide guidance on long-term architectural design decisions.
* Used Spark to perform data exploration and mining on real Apache web server log files.
* Consumed consumer geo data from the Caterpillar feedback stream and rendered locations on a map.
* Applied classification algorithms like SVM and KNN.
* Used Python scikit-learn for predictive modelling.
* Used Hadoop and MapReduce for Caterpillar equipment channel data analysis.
* Developed a RESTful service that mines more than 70 TB of data and responds to user requests via APIs using the Python Flask framework, Celery, RabbitMQ, Couchbase, Tornado, and Nginx.
* Built a web scraper and extracted data from web pages using Python and BeautifulSoup.
* Use Python modules such as requests, urllib, and urllib2, as well as JavaScript, for web crawling.
* Set up an intranet mail server using hMailServer and built an email crawler using Python and eml_parser to parse email bodies.
* Used Python's SMTP support to send customized emails.
* Create RESTful APIs for several of our intranet applications using open-source software packages.
* Develop remote integration with third-party platforms using RESTful web services.
* Use Python to extract information from XML files.
* Actively involved in the initial software development life cycle (SDLC) phase of requirements gathering and in suggesting system configuration specifications during client interaction.
* Generate various graphs for business decision making using the Python matplotlib library.
* Experienced in data visualization with Tableau. Embed Tableau dashboards on the SuperComm2 website for a better, more dynamic user experience.
* Design and create the database tables and write SQL queries to access Oracle.
* Use Couchbase and Redis to reduce overhead and respond quickly to API requests.
* Use AWS (Amazon Web Services) EMR for improved efficiency of storage.
* Utilize Python libraries like Boto3, NumPy, and Pandas for AWS.
* Use Celery as task queue and RabbitMQ as messaging broker to execute asynchronous tasks.
* Comfortable creating and maintaining Docker containers.
* Write shell scripts to automate tasks, application-specific syncs and backups, and other schedulers.
* Design, model and optimize relational database tables in MySQL.
* Perform requirements gathering and work closely with the architect in designing.
* Use agile development methodologies and tools, including Python, Git, PyCharm, and code review.
* Work on data migration: data extraction, data mapping and data insertion.
* Hands-on experience with the Cassandra Query Language (CQL) for querying data stored in Cassandra.
* Resolved several hidden bugs caused by complicated multithreading issues such as race conditions caused by asynchronous events.
* Designed interactive web pages as the front end of the web application using web technologies like HTML, JavaScript, jQuery, and AJAX, and implemented CSS for a better look and feel.
* Deployed projects to Heroku and used the GitHub version control system.

Research Associate, University of Cincinnati, Aug ’16 - July ’17