Professional and Academic Highlights
- Awarded an M.Sc. in Computer Science with Distinction by the University of Edinburgh.
- M.Sc. dissertation on join algorithms using Hadoop.
- Experience in managing and using live NoSQL clusters.
- Gave a public talk at the Cassandra London meetup on Hadoop integration in Cassandra.
- Worked for Microsoft India for 2 years.
- B.Tech. in Computer Science and Engineering with cumulative GPA of 9.1 (on 10).
- Awarded Best Student Project during undergraduate course.
- Amazon Cloud
Relevant Work Experience
Meltwater U.K. Ltd.
(August 2018 - Present)
Designation - Senior Software Engineer II - Cloud Architect
Responsible for productionizing Artificial Intelligence algorithms that crawl and extract editorial content from the Internet. Technologies involved:
- AWS Cloud Stack - including, but not limited to:
  - Step Functions
  - Route 53
- Dropwizard/Jersey REST
- Terraform / Serverless Framework
QuantumBlack Visual Analytics Ltd., London, UK
(October 2016 - June 2018)
Designation - Platform Engineer
Joined the QuantumBlack Platform Team to design and build backend systems that support the company's Data Science solutions.
Mediasift Ltd. (trading as Datasift Inc.), Reading, UK
(September 2011 - October 2016)
Designation - Engineering Team Lead
Started at the company as a Big Data Engineer. As part of the data-warehousing team at Datasift, primarily responsible for archiving, curating and retrieving the massive amounts of social data accumulated every day. The data is on the order of 2 TB/day. Technologies used include -
Played a key role in the development of the Historics platform for mining archived social media data for customers.
Promoted to Engineering Team Lead in late 2014. Involved at key stages in the design and development of the PYLON for Facebook Topic Data product. Technologies used include -
Imagini Europe Limited, London, UK
(December 2010 - September 2011)
Designation - Back-End Developer
Worked on Data Warehousing solutions using NoSQL technologies. Key player in building, maintaining and using the following solutions, standalone or in conjunction with each other -
- Cassandra – Primary Data-store
- Over 4 TB of data that is used for real-time access and analytics
- Hadoop - Main analytics engine
MSc - Computer Science [Awarded Distinction]
(Sept 2009 - Sept 2010)
School of Informatics, University of Edinburgh (Edinburgh, U.K.)
Specialization Modules -
- Design & Analysis of Parallel Algorithms
- Advanced Databases
- Distributed Systems
- Parallel Programming Languages & Systems
- Human Computer Interaction
- Compiler Optimisation
- Text Technologies & Information Retrieval
- Querying & Storing XML
Course work -
- Dissertation - Join algorithms using Map/Reduce - Evaluated existing join algorithms used in contemporary Map/Reduce systems. Designed two new algorithms for multi-way joins that exploit properties such as the selectivity factor of a join. The project was implemented on Hadoop and HDFS, with all code written in Java. The evaluation was based on speed-up, scale-up and network I/O. The thesis was awarded distinction and is available for download from the University website.
- Advanced Databases - Extended the query engine of a home-grown database to implement External Sort and Merge join algorithms. Scored 100% in the coursework, which was judged the best in the class.
- Information Retrieval - Developed a web crawler (in Python) capable of harvesting a set of hyper-linked news stories from a web server. Implemented a content-extraction algorithm using a plateau-based method, and a near-duplicate detection feature using the SimHash algorithm. Also developed a system for searching images by keywords (tags), implementing exact-match, best-match and pseudo-relevance feedback algorithms.
- Querying and Storing XML - Implemented an algorithm for updating XML via relational databases, together with two collaborators. The project incrementally updated recursive XML stored in an existing relational database, as opposed to previous approaches that shred the entire XML document into a newly created database with a newly designed schema.
- Distributed Systems - Implemented a simulation of the Chandy-Lamport snapshot algorithm, which is used in distributed systems to record a consistent global state of an asynchronous system.
Previous Work Experience
Microsoft India (R&D) Private Limited, Hyderabad, India
(June 2007 - April 2009)
Designation - Software Development Engineer
All projects involved work on the Microsoft technology stack. Almost all the coding was done in C# and all projects used Visual Studio Team Suite.
B.Tech - Computer Science and Engineering
(July 2003 - May 2007)
Amrita School of Engineering, Amrita Vishwa Vidyapeetham (Coimbatore, India)
- Scored a cumulative grade point average of 9.1 (on 10)
- Won the Best Student Project award for my final-year project on Machine Translation. The application translates input text from English to Hindi with acceptable levels of accuracy.
Non-Technical Work Experience
Indian School of Business
(June 2009 - August 2009)
Designation - Associate, Admissions and Financial Aid
- Worked on strengthening the e-marketing initiative for the Admissions Team.
- Co-ordinated the implementation of a new CRM system.
Richmond Place University Accommodation
(March 2010 - Sept 2010)
Designation - House Assistant
- Assisted the warden in maintaining discipline and decorum.
- Organized and managed social events for the residents.