COVID-19 Contact Tracing

Android

COVID-19 Contact Tracing is an Android application that notifies users if they have come in contact with someone who reported COVID-19 symptoms or a positive test result.

Because of the ongoing situation, Google’s Play Store policy does not allow applications related to COVID-19 unless they are affiliated with a government health organization, so you won’t be able to find this application on the store. Here is a link to the beta test build:

https://appdistribution.firebase.dev/i/D2oxMUWN

This app is built around the DP-3T protocol and uses the phone’s Bluetooth to communicate with nearby users. Each phone running the app generates a unique random key every 24 hours and shares that key with other phones in close proximity.

For ease of understanding, consider two users, A and B.

A and B each generate a unique key every 24 hours. If A and B come close to each other, their phones broadcast that day’s key to each other, and each phone saves the received key in a private, secure local database.

Suppose that a few days later (fewer than 14) A reports testing positive for COVID-19. A’s phone then uploads that result along with its keys from the past 14 days. B’s phone periodically syncs with the server for the latest data, finds a match, and shows a notification saying that B came in contact with someone who tested positive within the last 14 days.

User A’s phone keeps uploading each newly generated key from that day onward until the user reports a negative test.
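The exchange between A and B described above can be sketched in a few lines of Python. This is only a minimal model of the simplified flow in this post, not the app's actual code and not the full DP-3T specification; the class name Phone, its methods, and the in-memory dictionaries standing in for the secure database and the server are all assumptions made for illustration.

```python
import secrets
from datetime import date, timedelta

RETENTION_DAYS = 14  # window taken from the description above


class Phone:
    """Hypothetical model of one handset running the app."""

    def __init__(self):
        self.own_keys = {}   # day -> key this phone broadcast on that day
        self.seen_keys = {}  # day -> set of keys received from nearby phones

    def key_for(self, day):
        # Generate one random 128-bit key per day, on first use.
        if day not in self.own_keys:
            self.own_keys[day] = secrets.token_hex(16)
        return self.own_keys[day]

    def meet(self, other, day):
        # Both phones broadcast that day's key and store the other's key.
        self.seen_keys.setdefault(day, set()).add(other.key_for(day))
        other.seen_keys.setdefault(day, set()).add(self.key_for(day))

    def keys_to_upload(self, today):
        # Keys from the past 14 days, sent along with a positive report.
        cutoff = today - timedelta(days=RETENTION_DAYS)
        return {k for d, k in self.own_keys.items() if cutoff < d <= today}

    def was_exposed(self, reported_keys, today):
        # Compare locally stored keys with the keys published by the server.
        cutoff = today - timedelta(days=RETENTION_DAYS)
        return any(
            keys & reported_keys
            for d, keys in self.seen_keys.items()
            if cutoff < d <= today
        )
```

In this sketch the matching happens on B's phone, so the server only ever holds the keys of users who reported a positive result.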

Review on Apache Flink

research

Abstract:

This paper gives an overview of Apache Flink, an open source stream processing framework developed by the Apache Software Foundation. It provides some insight into how Flink works and how efficient it is compared to other available data processing tools. It also explains why such tools are needed and why they matter for making data-driven decisions and learning more from Big Data. Several techniques for processing Big Data have been developed over time, such as Hadoop and MapReduce. The core of Apache Flink is a distributed streaming dataflow engine written in Java and Scala [1]. Finally, the paper covers what a developer or data processing manager should be aware of when using Apache Flink to process large amounts of data flowing into a system at a high rate.

Introduction

Big Data refers to a volume of data so large that it exceeds the processing capacity of traditional database systems, which makes it a problem to store or process. Taken together, the data forms a very complex structure, so it becomes difficult to iterate over it and extract useful information. The need for distributed data processing frameworks is growing rapidly because of the rising demand for the analysis that data processing makes possible. Two well-known data processing tools with APIs for both batch and streaming data are Apache Flink and Apache Spark. This paper gives insight into how Apache Flink works and compares it with other tools such as Apache Spark and Apache Beam. It also includes statistics gathered by running the word count example on single-node clusters of Apache Flink and Apache Spark. An easy-to-read comparison of different data processing frameworks is provided in tabular format, which can be useful for anyone deciding which tool suits their application.

For full paper click here

Music Genre Classification Using Lyrics

research

Classification of music is an important and heavily researched task in the field of NLP. Previous research has focused on classifying music by mood, genre, annotations, and artist. These approaches used audio features, lyrics as text, or both in combination.

Genre classification by lyrics is itself a clear Natural Language Processing problem. The end goal of NLP is to extract some sort of meaning from text. For music genre classification, this equates to finding features that can classify music using lyrics. There is a wide range of scholarly and commercial applications for automated music genre classifiers. For example, classifiers could be used to automatically analyze and sort music into large databases. Music recommendation systems could automatically analyze a user’s likings and recommend appropriate songs to listen to. A music classifier can also be used to recommend music based on the user’s mood. Similarity analysis which is a part of music

Report

Data

We scraped song data from songsLyrics.com and metrolyrics.com. We also used a song dataset from kaggle.com.
Our data included around 390,000 songs, with attributes such as song, lyrics, year, and artist, plus the target attribute genre. For our task, we sampled 20,000 songs of each genre from the original dataset.

Pre-Processing

Removed instances with genres like “not available” and “other”.
Removed genres which didn’t have many instances.
Removed unnecessary characters using regular expressions.
Removed stopwords using nltk’s English stopwords and Stanford’s stopword list.
Stemmed the tokens in each song using nltk’s Snowball stemmer.
Fixed the text encoding using ftfy, since some songs in our dataset contained non-English words, and removed instances that still contained non-English words after the encoding fix.
Removed words such as ‘Chorus’ and ‘Verse’, which mark different parts of a song.
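Part of the cleaning pipeline above can be sketched with the Python standard library alone. This is a simplified illustration, not the project's code: the stopword set below is a tiny hypothetical stand-in for the nltk and Stanford lists, the function name preprocess is made up, and the stemming and ftfy steps are left out so that the example runs without extra packages.

```python
import re

# Hypothetical, tiny stopword set; the project used the nltk and Stanford lists.
STOPWORDS = {"a", "an", "and", "the", "i", "you", "to", "of", "in", "is", "it"}

# Words that label parts of a song rather than carrying lyrical content.
SECTION_MARKERS = {"chorus", "verse"}


def preprocess(lyrics):
    """Lowercase, strip non-letter characters, drop stopwords and markers."""
    text = lyrics.lower()
    # Replace everything except ASCII letters and whitespace with a space.
    text = re.sub(r"[^a-z\s]", " ", text)
    tokens = text.split()
    return [
        t for t in tokens
        if t not in STOPWORDS and t not in SECTION_MARKERS
    ]
```

For example, preprocess("[Chorus] I love the rain, and you!") keeps only the tokens "love" and "rain".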

Target variable

For our classification task, the target variable is the genre, which takes one of four values:

Classical
Pop
Metal
Jazz

Features

Similarity with four genres: We calculated the top 30 words in each genre using tf-idf and created four features named metal_similarity, pop_similarity, rock_similarity, and hip_hop_similarity. If a token appeared among the top 30 words of a genre, we used its tf-idf value to calculate the cosine similarity with the tf-idf value of that token in that genre.
POS tags: Using the nltk tokenizer, we used a normalized count of POS tags.
Word2vec: We trained a word2vec model on the whole dataset and the Brown corpus, then used it to generate a word2vec vector for each token in each song.
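The genre similarity feature can be illustrated with a small, self-contained sketch. It is one possible reading of the description above, not the project's implementation: the function names are hypothetical, each genre is treated as a single document, and the smoothed idf formula used here is an assumption.

```python
import math
from collections import Counter


def tf_idf(docs):
    """docs maps a genre name to its token list; returns genre -> {token: weight}."""
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs.values():
        doc_freq.update(set(tokens))
    weights = {}
    for genre, tokens in docs.items():
        counts = Counter(tokens)
        total = len(tokens)
        # Term frequency times a smoothed inverse document frequency (assumed variant).
        weights[genre] = {
            tok: (cnt / total) * (math.log((1 + n_docs) / (1 + doc_freq[tok])) + 1)
            for tok, cnt in counts.items()
        }
    return weights


def top_words(weight_map, k=30):
    # Keep the k tokens with the highest tf-idf weight.
    ranked = sorted(weight_map.items(), key=lambda kv: kv[1], reverse=True)
    return dict(ranked[:k])


def cosine_similarity(u, v):
    # u and v are sparse vectors stored as {token: weight} dictionaries.
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)
```

Under these assumptions, a feature such as metal_similarity for one song would be the cosine similarity between the song's tf-idf vector and top_words(weights["metal"]).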

Models used:

Dummy Classifier
kNN Classifier
MLP Classifier
Gradient Boosting
Logistic Regression

Metrics Used:

Accuracy
F1 Score
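For reference, both metrics can be computed by hand as in the sketch below. The write-up does not say how F1 was averaged across genres, so the macro average shown here is an assumption, and the function names are illustrative.

```python
def accuracy(y_true, y_pred):
    # Fraction of predictions that match the true label.
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def f1_per_class(y_true, y_pred, label):
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    # Unweighted mean of the per-class F1 scores (assumed averaging).
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_per_class(y_true, y_pred, l) for l in labels) / len(labels)
```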

Conclusion

After analyzing the tf-idf values and the confusion matrix, we saw how similar rock and metal songs are; most of the classifiers confused these two genres. In future work, we could use additional features such as parse trees, word endings, and song length to distinguish between these two genres and further increase the accuracy of the classifiers.

Results:

Confusion Matrices:

Project Link: https://github.com/kartikprakash1993/Lyrics-Analysis

Music Genre Classification Using Audio Signals

research

The objective of this project was to apply machine learning to classify a song based on its audio features. This approach could also be used for recommendation on a large-scale system. Music genres are hard to describe systematically and consistently because they are inherently subjective. This project uses a small dataset to understand how to approach this kind of problem and to develop a model that is easy to understand and use.

The following ML algorithms were applied:

k-Nearest neighbors
Neural Networks with different parameters
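As an illustration of the first algorithm in this list, here is a minimal k-nearest-neighbours classifier over numeric feature vectors. It is a generic sketch rather than the project's code; the function name knn_predict and the choice of Euclidean distance are assumptions.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Predict the label of query by majority vote among its k nearest points."""
    # Sort training points by Euclidean distance to the query.
    by_distance = sorted(
        zip(train_x, train_y),
        key=lambda pair: math.dist(pair[0], query),
    )
    # Majority vote among the labels of the k closest points.
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]
```

In practice the features would need to be scaled to comparable ranges first, since distance-based methods are sensitive to the units of each feature.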

DataSet

We have 2530 instances distributed among four genres (our target variable). The genres are as follows:

Classical
Jazz
Pop
Metal

We initially worked with 10 genres, but because of the large data size we reduced our target categories to 4. We selected the four genres above because they have distinct styles of music.

Data source

We collected the data from a website that provides free and legal downloads of music tracks by genre. Instead of downloading tracks manually, we wrote a web crawler in Java using the Selenium framework, a web automation tool. We extended the crawler so that it loads the webpage dynamically and downloads the tracks. Because of the large size of the data, we ran the crawler on 4 computers simultaneously. It took roughly 7 hours to download each genre.

Target variable

For our classification task, the target variable is the genre, which takes one of four values:

Classical
Pop
Metal
Jazz

Features

We chose 5 features:

Pitch (Chromagram): In music, the pitch of a note means how high or low a note is.
RMS: The RMS (Root-Mean-Square) value is the effective value of the total waveform.
Tempo: Tempo is the speed or pace of a given piece or subsection.
Roll-Off: Roll-off is the steepness of a transmission function with frequency.
Zero Crossing Rate: The zero-crossing rate is the rate of sign-changes along a signal.
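Two of these features follow directly from their definitions and can be computed on a raw list of samples, as in the sketch below. The function names are illustrative, and the write-up does not say which library was used for feature extraction; in practice both values are usually computed per frame rather than over a whole track.

```python
import math


def rms(samples):
    # Root-mean-square: square root of the mean of the squared samples.
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def zero_crossing_rate(samples):
    # Fraction of adjacent sample pairs whose signs differ.
    crossings = sum(
        (a >= 0) != (b >= 0) for a, b in zip(samples, samples[1:])
    )
    return crossings / (len(samples) - 1)
```

A signal that alternates between 1 and -1 has an RMS of 1.0 and a zero-crossing rate of 1.0, while a signal that never changes sign has a zero-crossing rate of 0.0.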

Conclusions:

k-Nearest neighbours with different value of number of neighbours:

Neural network with different layers and solver:

Important features for k-NN

The most important feature for k-NN: Roll-Off

Important features for Neural Networks

The most important features for Neural Networks: Roll-Off and Tempo
Negative features for Neural Networks: Pitch, RMS, and Zero Crossing Rate

Project Link

Rapido Mobile Application

Android

Fast food restaurants usually have individuals stand in a line or a queue; the first individual to come into the line will be the first to order and should, ideally, also be the first to receive their order when it is ready. One noticeable issue with waiting in line is that the size of the line may grow very large and the wait time may be too much for those that do not have the time to wait in it. Thus, the purpose of this application is to provide users with a means of ordering food without having to wait in line at the restaurant.

This app could potentially provide sellers with another means of increasing their revenue by attracting customers who may have never been to their restaurant and prefer convenience in their given situation.

Potential users of this application are:
– Those that work in offices that would want to quickly order food from where they work. They would assume that their order is being made while they are on their way to pick it up. As soon as they arrive at the restaurant, their order should be ready for pick-up or nearly complete.
– Those that are looking for places to eat after attending a large event such as a conference or a concert. They may be unfamiliar with the area and, as a result, use the app to conveniently find a place that suits their budget, schedule, and taste.
– In general, those that are busy, such as students, parents, and professionals who want to save time when ordering food. One can place an order from any location where there is Wi-Fi, since much of the functionality of this app depends on it (e.g., communication with the server, payment). At this point in time, it has been decided that the customer can order from only one restaurant at a time through this app.

Characteristics shown during the ordering process that may help the user decide which restaurant to order from are:

(1) the distance between their current location and the location of a particular restaurant
(2) the traffic of the restaurant at a point in time (e.g., the number of orders that have been placed at said restaurant)
(3) food prices
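One way to combine these three characteristics is to sort restaurants by distance, then by current traffic, then by price. The sketch below is hypothetical: the dictionary keys lat, lon, pending_orders, and avg_price, the function names, and the sort order are assumptions for illustration, and the distance uses the standard haversine formula.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def rank_restaurants(user, restaurants):
    # user is a (lat, lon) tuple; sort by distance, then traffic, then price.
    return sorted(
        restaurants,
        key=lambda r: (
            haversine_km(user[0], user[1], r["lat"], r["lon"]),
            r["pending_orders"],
            r["avg_price"],
        ),
    )
```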

When a customer has placed an order, a notification will be sent to both the customer and the seller; the customer’s notification will show a confirmation of their order and the seller’s notification will signal that an order has been placed at their restaurant. Updated notifications will be sent to the customer periodically, indicating the amount of time until their order is ready. This information should help them plan to pick up their food in a way that saves time. On arriving at the restaurant, the customer must present to the seller a unique QR code that is generated when the order is placed. This lets the seller keep track of orders placed through this app, which may be useful for restaurants that are particularly busy. Scanning the QR code then allows the customer to obtain their order.
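The pick-up check can be modelled as a one-time token that is issued when the order is placed and redeemed when the seller scans it. This is a hypothetical sketch: the class OrderBook and its methods are made up for illustration, the in-memory dictionary stands in for the server's database, and rendering the token as an actual QR image is left out.

```python
import secrets


class OrderBook:
    """Hypothetical server-side record of orders awaiting pick-up."""

    def __init__(self):
        self._pending = {}  # token -> order description

    def place_order(self, description):
        # The random token is what the QR code shown to the seller would encode.
        token = secrets.token_urlsafe(16)
        self._pending[token] = description
        return token

    def redeem(self, token):
        # Return the order exactly once; None if unknown or already picked up.
        return self._pending.pop(token, None)
```

Removing the token on redemption means a copied or re-scanned code cannot be used to collect the same order twice.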

Big Data & Hadoop Framework

research

In this paper, we describe Big Data and the open source framework Hadoop. Statistics are provided to explain how Big Data is formed and why analyzing it is important. Big Data poses many problems in real-world scenarios because of its vast size, velocity, and variety. Several techniques have been developed to process Big Data. The Hadoop framework is one such technique and is described along with pseudocode for its mapper and reducer functions. Hadoop’s file system, HDFS, is also explained with the help of a diagram.
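To give a feel for the mapper and reducer functions, here is a word count written in the MapReduce style in plain Python. It is not the pseudocode from the paper: the choice of word count as the example and the function run_job, which stands in for Hadoop's shuffle phase on a single machine, are assumptions made for illustration.

```python
from collections import defaultdict


def mapper(line):
    # Emit a (word, 1) pair for every word in one input line.
    for word in line.lower().split():
        yield word, 1


def reducer(word, counts):
    # Sum all the counts that were grouped under the same word.
    return word, sum(counts)


def run_job(lines):
    # Stand-in for the shuffle phase: group mapper output by key.
    grouped = defaultdict(list)
    for line in lines:
        for word, count in mapper(line):
            grouped[word].append(count)
    return dict(reducer(w, c) for w, c in grouped.items())
```

In a real cluster the mapper calls run in parallel on different blocks of the input stored in HDFS, and the framework handles the grouping between the two phases.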

Big Data & Hadoop Implementation Download

Home Assistant – Alfred

robotics

HOME ASSISTANT as a system simply means making your home digital and acting as a personal assistant to you.

It digitizes the home, housework, and household activities.
It includes centralized control of appliances, your daily schedule, a communication system, and inventory management.
It works as an assistant for your needs and your personal life. It is an interactive device that you can talk to and converse with, and to which you can assign tasks, either as reminders or to store in a database.
It can be used to increase the security of your house through face recognition, motion sensors, and an alerting system. It can also act as a mode of communication between two or more rooms.
It is able to learn from its environment and act accordingly. Inventory management is also implemented in this system, so it can alert you about the things you are running low on at home.

Industrial Robotic Arm Prototype

robotics

Featured story in TOI (click here)

This project demonstrates an industrial robotic arm’s working and how it can be used for various purposes, especially in a sorting warehouse environment.

This project’s goal was to implement image processing that can be used to identify different objects and use a depth sensor to calculate the distance between the object and the arm. Doing this gives us the capability to automate various workflows that are prone to human errors.

Especially in big warehouses that require constant sorting and automated shipping procedures, this concept can be easily adapted to be super-efficient with less probability of errors.

Hardware used:

Servos
Arduino
DC motors
Ultrasound Sensors
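If the ultrasound sensors listed above are used for ranging (an assumption; the write-up only mentions a depth sensor), the distance to an object follows from the echo's round-trip time. The function name and the speed of sound constant below are illustrative.

```python
SPEED_OF_SOUND_M_PER_S = 343.0  # approximate value for dry air at about 20 °C


def echo_distance_cm(echo_seconds):
    # The pulse travels to the object and back, so halve the round trip.
    return echo_seconds * SPEED_OF_SOUND_M_PER_S / 2 * 100
```

An echo that returns after one millisecond corresponds to an object roughly 17 cm away.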

Daily shop

Android

This application lets its users easily browse through options and buy products. The application retrieves the products from the database and allows users to buy them.

It displays different categories of products such as bread & eggs, fruits, vegetables, peas, rice & oil, etc. The exciting part of this application is that it works for both customers and merchants. The application starts by asking the user to choose between merchant and customer. Whichever the user selects, they are directed to the login page and asked to provide their login credentials. A signup page is provided for new users. A customer is directed to the Category activity, where they select a product category. After selecting the category, the user is shown a list of products to choose from. Once the user selects a product, further product details are shown along with an option to put the product in the basket. A basket icon is displayed at the top of the screen, from which the user can check the contents of their basket and check out with the products they selected.

Kartik Prakash

Technology Enthusiast

Author: kartikprakash93
