Machine Learning Introduction Series by Women Who Code: Notes

I am attending the ML series run by Women Who Code, a six-week program in which each week covers a few topics in ML. I will be posting my notes and assignments for each week on this blog.

ML Intro Series 1:

Series 1 was divided into two parts, ML basics and hypothesis testing. An introductory lab on how to use Colab was also provided.

ML Basics/Intro

What is Machine Learning?

a subset of AI
class of computer algorithms that learns from data
algorithms that improve with experience
data and outputs are provided, and the result is a function that maps input to output, which can then be used in multiple scenarios

Why ML now?

large computing power
big data available
technologies for handling data are now available
high storage capacity
higher RAM available
reduction in the gap between academia and industry

Terms

data
features
target variable

Types of ML

supervised
unsupervised
semi-supervised
reinforcement learning

Supervised Learning

known: inputs and outputs (the training examples)
unknown: the function that maps input to output
goal: to find that function
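As a minimal sketch of this idea, the snippet below fits a straight line to a handful of training examples using ordinary least squares. The data (hours studied and exam scores) and the names `fit_line` and `predict` are made up for illustration; they were not part of the session.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = w * x + b to paired training examples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: covariance of x and y divided by the variance of x.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    w = cov_xy / var_x
    b = mean_y - w * mean_x
    return w, b


# Hypothetical training examples: known inputs and known outputs.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 54.0, 56.0, 58.0, 60.0]

# The learned function is described by w and b.
w, b = fit_line(hours, scores)


def predict(x):
    """Apply the learned function to a new, unseen input."""
    return w * x + b


print(w, b)          # 2.0 50.0
print(predict(6.0))  # 62.0
```

The known part is the list of (input, output) pairs, the unknown part is the pair `w`, `b`, and "learning" here just means choosing the values that fit the examples best.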

Types of Supervised Learning

regression when target is continuous
classification when target is categorical

We basically need supervised learning when there is no human expert for the task, when humans can't describe how the task is done, when the function changes frequently, or when we need a personalised function for each use case.

Hypothesis Testing

A hypothesis test calculates some quantity under a given assumption.
The value of that quantity tells us whether the assumption holds true or is violated.

The normal distribution is a type of population distribution that is very commonly found in natural phenomena.

Conducting a hypothesis test

The test starts with the assumption that a null hypothesis, also called the default hypothesis, holds true. A violation of this assumption is called the first hypothesis, also known as the alternate hypothesis.

P-Value

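It is a quantity that can be used to interpret the result of a hypothesis test: the probability of seeing a result at least as extreme as the observed one if the null hypothesis is true.
Many test statistics can be used to calculate a p-value.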

Alpha

It is the significance level used to decide whether to reject the null hypothesis: we reject it when the p-value is below alpha.
It is generally 5% or 0.05. The lower the alpha, the higher the confidence.
Confidence is 1 minus alpha.
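To make the link between alpha and confidence concrete, here is a small sketch that prints the confidence level and the two-tailed critical z value for a few alphas. It assumes a standard normal test statistic; the function name `two_tailed_critical_z` is my own.

```python
from statistics import NormalDist


def two_tailed_critical_z(alpha):
    """z value beyond which a two-tailed test rejects at level alpha."""
    # Alpha is split evenly between the two tails of the distribution.
    return NormalDist().inv_cdf(1 - alpha / 2)


for alpha in (0.10, 0.05, 0.01):
    confidence = 1 - alpha
    critical_z = two_tailed_critical_z(alpha)
    print(f"alpha={alpha:.2f} confidence={confidence:.2f} critical z={critical_z:.3f}")
```

For alpha = 0.05 the critical value is about 1.96. A smaller alpha pushes the critical value further out, so stronger evidence is needed before rejecting.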

Errors in statistical tests

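type 1 error, a false positive (rejecting a null hypothesis that is actually true)
type 2 error, a false negative (failing to reject a null hypothesis that is actually false)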
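The two error types can be seen in a quick simulation. The sketch below assumes a one-sample, two-tailed z-test with known standard deviation 1 and a null hypothesis that the mean is 0. The sample size, number of trials, seed and the alternative mean of 0.5 are arbitrary choices of mine, not values from the session.

```python
import random
from statistics import NormalDist, mean


def z_test_p_value(sample, mu0, sigma):
    """Two-tailed p-value of a one-sample z-test with known sigma."""
    z = (mean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def rejection_rate(true_mean, trials=2000, n=30, alpha=0.05, seed=0):
    """Fraction of simulated experiments that reject H0: mean == 0."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        if z_test_p_value(sample, mu0=0.0, sigma=1.0) < alpha:
            rejections += 1
    return rejections / trials


# H0 is true here, so every rejection is a false positive (type 1 error).
type_1_rate = rejection_rate(true_mean=0.0)
# H0 is false here, so every non-rejection is a false negative (type 2 error).
type_2_rate = 1 - rejection_rate(true_mean=0.5)

print(f"type 1 error rate: {type_1_rate:.3f}")
print(f"type 2 error rate: {type_2_rate:.3f}")
```

When the null hypothesis is true, the type 1 error rate should land close to alpha, which is exactly what the significance level controls.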

Homework

Linking Colab with GitHub
Reading up on test statistics on Wikipedia
Z-score, two-tailed test for a numerical problem
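For the z-score item, a worked example could look like the sketch below. The numbers (population mean 100, standard deviation 15, sample mean 105, sample size 36) are hypothetical and not the actual homework problem, and `one_sample_z_test` is a helper name I made up.

```python
from statistics import NormalDist


def one_sample_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-tailed one-sample z-test; returns (z, p_value, reject_null)."""
    standard_error = sigma / n ** 0.5
    z = (sample_mean - mu0) / standard_error
    # Two-tailed: count extreme values on both sides of the mean.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value, p_value < alpha


z, p, reject = one_sample_z_test(sample_mean=105, mu0=100, sigma=15, n=36)
print(f"z = {z:.2f}, p = {p:.4f}, reject null: {reject}")
```

Here the standard error is 15 / 6 = 2.5, so z = 5 / 2.5 = 2.0, which gives a p-value of about 0.0455. That is just below 0.05, so the null hypothesis is rejected at the 5% level.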

ML Intro Series 2:

Series 2 covered conditional probability, Naive Bayes and Bayesian learning, along with a lab on implementing a Naive Bayes classifier using scikit-learn.

Classification

In classification, the responses are categorical in nature.

Events can be

dependent
independent

Conditional Probability

defines the probability of an event given that another, dependent event has occurred
the occurrence of one event changes the probability of the other event
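A small sketch of conditional probability, using one fair die as a made-up example: knowing that the roll is even changes the probability that the roll is a 2. The helper names are my own.

```python
from fractions import Fraction

outcomes = range(1, 7)  # faces of one fair die


def probability(event):
    """P(event) by counting equally likely outcomes."""
    favourable = [o for o in outcomes if event(o)]
    return Fraction(len(favourable), len(outcomes))


def conditional(event_a, event_b):
    """P(A | B) = P(A and B) / P(B)."""
    p_b = probability(event_b)
    p_a_and_b = probability(lambda o: event_a(o) and event_b(o))
    return p_a_and_b / p_b


def is_two(o):
    return o == 2


def is_even(o):
    return o % 2 == 0


print(probability(is_two))           # 1/6
print(conditional(is_two, is_even))  # 1/3
```

Without any information the chance of a 2 is 1/6, but once we know the roll is even only three outcomes remain, so it rises to 1/3.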

Bayes Theorem Links

https://machinelearningmastery.com/bayes-theorem-for-machine-learning/

https://stattrek.com/probability/bayes-theorem.aspx

https://towardsdatascience.com/intro-to-bayesian-statistics-5056b43d248d

Bayes Theorem Formula

P(A|B)=P(B|A)*P(A)/P(B)

Prior Probability

Probability of an event before any new evidence is taken into account, P(A) in the formula above.

Posterior Probability

Probability of an event after the evidence has been taken into account, P(A|B) in the formula above.
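Here is a sketch of Bayes theorem with numbers plugged in. The scenario is a made-up medical test: 1% of people have a condition (the prior), the test is positive for 95% of those who have it, and it is also positive for 5% of those who don't. The function and parameter names are my own.

```python
def posterior(prior, likelihood, likelihood_if_not):
    """P(A|B) from P(A), P(B|A) and P(B|not A), using Bayes theorem."""
    # P(B) by the law of total probability over A and not A.
    evidence = likelihood * prior + likelihood_if_not * (1 - prior)
    return likelihood * prior / evidence


# A = has the condition, B = test is positive (hypothetical numbers).
p = posterior(prior=0.01, likelihood=0.95, likelihood_if_not=0.05)
print(round(p, 3))  # 0.161
```

Even with a positive test the probability of having the condition is only about 16%, because the prior was so low to begin with.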

Naive Bayes Classifier is based on the principle of Bayes Theorem

It assumes all features are independent of each other.
All features contribute equally.

These assumptions can be wrong and due to these assumptions, this classifier is called naive.
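The lab used scikit-learn, but to see what the classifier does under the hood, here is a from-scratch sketch for categorical features. It is not the lab code: the class name `TinyNaiveBayes`, the toy weather data and the add-one (Laplace) smoothing are all my own assumptions for illustration.

```python
from collections import Counter, defaultdict
from math import log


class TinyNaiveBayes:
    """Categorical Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, rows, labels):
        self.class_counts = Counter(labels)
        self.total = len(labels)
        # value_counts[(label, feature_index)][value] = times seen together
        self.value_counts = defaultdict(Counter)
        # Distinct values seen per feature, needed for smoothing.
        self.feature_values = defaultdict(set)
        for row, label in zip(rows, labels):
            for i, value in enumerate(row):
                self.value_counts[(label, i)][value] += 1
                self.feature_values[i].add(value)
        return self

    def predict(self, row):
        best_label, best_score = None, float("-inf")
        for label, count in self.class_counts.items():
            # log P(label) plus the sum of log P(value | label).
            # Adding the logs is where feature independence is assumed.
            score = log(count / self.total)
            for i, value in enumerate(row):
                seen = self.value_counts[(label, i)][value]
                n_values = len(self.feature_values[i])
                score += log((seen + 1) / (count + n_values))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical data: (outlook, wind) -> did we play outside?
rows = [
    ("sunny", "windy"), ("sunny", "calm"), ("rainy", "windy"),
    ("rainy", "calm"), ("sunny", "calm"), ("overcast", "calm"),
]
labels = ["no", "yes", "no", "no", "yes", "yes"]

model = TinyNaiveBayes().fit(rows, labels)
print(model.predict(("sunny", "calm")))   # yes
print(model.predict(("rainy", "windy")))  # no
```

Each feature's probability is simply multiplied in (added, in log space) without looking at the other features, which is the "naive" part.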

OCR Image Text Detection and Image Manipulation Project

Developed as a course project, this project tested my ability to learn and use Python libraries: OpenCV to detect faces, Tesseract to do optical character recognition, and PIL to composite images together into contact sheets. The task was to write Python code that searches through images for occurrences of keywords and faces: it performs text detection on newspaper image data and returns a contact sheet of all the faces located on each newspaper page that mentions the given text. I divided the whole task into subtasks, each written as a function (get the files, binarise, check whether the string is found, show the faces, build the contact sheet, show the sheet), and used libraries for each subtask, such as PIL and cv2 for images.

Ankita Rafiz
