Machine Learning
getting started
google://getting started with machine learning
https://www.kaggle.com/wiki/GettingStartedWithPythonForDataScience - in progress
https://www.quora.com/I-want-to-learn-machine-learning-Where-should-I-start
http://thunderboltlabs.com/blog/2013/11/09/getting-started-with-machine-learning/
http://machinelearningmastery.com/machine-learning-for-programmers/
https://www.kaggle.com/dfernig/reddit-comments-may-2015/the-biannual-reddit-sarcasm-hunt/code
course: at coursera https://www.coursera.org/learn/machine-learning/home/week/1
understanding machine learning theory algorithms
Course Plan
Understanding Machine Learning with Python
- By Jerry Kurata
- May 16, 2016
- Beginner
- This is rated 4.52821 (638)
- 1h 53m
- Offered by: pluralsight
- https://app.pluralsight.com/library/courses/python-understanding-machine-learning/table-of-contents
- Status: Not started
Building Machine Learning Models in SQL Using BigQuery ML
- Building Machine Learning Models in SQL Using BigQuery ML
- By Janani Ravi
- Nov 19, 2018
- Beginner
- This is rated 4.92308 (13)
- 1h 27m
- Offered by pluralsight
- https://app.pluralsight.com/library/courses/sql-bigquery-ml-building-machine-learning-models/table-of-contents
- Status: Not started
Preparing Data for Machine Learning
- Preparing Data for Machine Learning
- By Janani Ravi
- Oct 28, 2019
- Beginner
- This is rated 4.4375 (32)
- 3h 24m
- https://app.pluralsight.com/library/courses/preparing-data-machine-learning/table-of-contents
Preparing Data for Feature Engineering and Machine Learning
- Preparing Data for Feature Engineering and Machine Learning
- By Janani Ravi
- Oct 28, 2019
- Beginner
- This is rated 4.64 (25)
- 3h 17m
- https://app.pluralsight.com/library/courses/preparing-data-feature-engineering-machine-learning/table-of-contents
Building End-to-end Machine Learning Workflows with Kubeflow
- Building End-to-end Machine Learning Workflows with Kubeflow
- By Abhishek Kumar
- Apr 23, 2020
- Beginner
- No Rating
- 3h 30m
- https://app.pluralsight.com/library/courses/building-end-to-end-machine-learning-workflows-kubeflow/table-of-contents
Data Wrangling with Pandas for Machine Learning Engineers
- Data Wrangling with Pandas for Machine Learning Engineers
- By Mike West
- Aug 08, 2018
- Beginner
- This is rated 3.82051 (39)
- 1h
- https://app.pluralsight.com/library/courses/pandas-data-wrangling-machine-learning-engineers/table-of-contents
Building Your First scikit-learn Solution
- Building Your First scikit-learn Solution
- By Janani Ravi
- May 01, 2019
- Beginner
- This is rated 4.7377 (61)
- 2h 7m
- https://app.pluralsight.com/library/courses/building-first-scikit-learn-solution/table-of-contents
Build, Train, and Deploy Your First Neural Network with TensorFlow
- Build, Train, and Deploy Your First Neural Network with TensorFlow
- By Jerry Kurata
- Jan 22, 2020
- Beginner
- This is rated 4.58333 (36)
- 2h 47m
- https://app.pluralsight.com/library/courses/build-train-deploy-first-neural-network-tensorflow/table-of-contents
Network Analysis in Python: Getting Started
- Network Analysis in Python: Getting Started
- By Artur Krochin
- Apr 09, 2019
- Beginner
- This is rated 4.92857 (14)
- 1h 58m
- https://app.pluralsight.com/library/courses/python-network-analysis-getting-started/table-of-contents
Building Features from Numeric Data
- Building Features from Numeric Data
- By Janani Ravi
- Apr 07, 2019
- Beginner
- This is rated 5 (15)
- 2h 25m
- https://app.pluralsight.com/library/courses/building-features-numeric-data/table-of-contents
More
https://app.pluralsight.com/library/courses/applying-machine-learning-data-gcp/table-of-contents
algorithms
- random forest
- https://medium.com/rants-on-machine-learning/the-unreasonable-effectiveness-of-random-forests-f33c3ce28883
- Nearest Neighbors Classification
- http://scikit-learn.org/stable/modules/neighbors.html
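To make the nearest neighbors idea concrete, here is a minimal sketch in plain Python rather than scikit-learn's own implementation. The function name `knn_predict` and the toy data are made up for illustration; the logic is the usual one: measure the distance from the query to every training point, keep the k closest, and take a majority vote over their labels.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbours
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    # Majority vote over the neighbours' labels
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy data (hypothetical): two well-separated clusters in 2D
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["a", "a", "a", "b", "b", "b"]

print(knn_predict(points, labels, (1.1, 1.0)))
print(knn_predict(points, labels, (5.1, 5.0)))
```

This brute-force version compares the query against every training point, which is fine for small data; the scikit-learn page linked above covers tree-based indexes for larger sets.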
tools
python + libs
- SystemML - a Universal Translator for Big Data and Machine Learning
image labeling
https://github.com/Labelbox/Labelbox
TensorFlow Playground
http://playground.tensorflow.org
sample data
blogs
Cool Projects
https://github.com/aficnar/slackpolice
- Aerospace Controls Lab
- http://acl.mit.edu/
- https://www.youtube.com/channel/UCVTxuaJsdMrk3UEcHVll9Yg
https://qz.ai/spotting-circling-helicopters/
Data leaks
When data associated with the data set gives away the target data.
Primarily of concern in competition.
Unexpected data.
reference: https://www.coursera.org/learn/competitive-data-science/lecture/5w9Gy/basic-data-leaks
Future peeking - using time series data that's not in the target time period, for example in the future.
Meta data leaks - for example file meta data, zip file meta data, image file meta data.
information hidden in ID and hashes,
and information hidden in row order and possibly duplicate rows
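Two of the leak types above can be guarded against with simple checks. The sketch below is only an illustration under assumptions: rows are plain dicts, and the helper names (`time_based_split`, `duplicated_feature_rows`) and the column names `timestamp`, `x`, `y` are hypothetical. The first helper avoids future peeking by splitting on time instead of at random; the second flags validation rows whose feature values already appear in the training set.

```python
def time_based_split(rows, cutoff, time_key="timestamp"):
    """Train on rows strictly before `cutoff`, validate on the rest."""
    train = [r for r in rows if r[time_key] < cutoff]
    valid = [r for r in rows if r[time_key] >= cutoff]
    return train, valid


def duplicated_feature_rows(train, test, feature_keys):
    """Return test rows whose feature values also appear in train."""
    seen = {tuple(r[k] for k in feature_keys) for r in train}
    return [r for r in test if tuple(r[k] for k in feature_keys) in seen]


# Toy rows (hypothetical): y is the target, x the only feature
rows = [
    {"timestamp": 1, "x": 10, "y": 0},
    {"timestamp": 2, "x": 11, "y": 1},
    {"timestamp": 3, "x": 10, "y": 0},
    {"timestamp": 4, "x": 12, "y": 1},
]

train, valid = time_based_split(rows, cutoff=3)
dupes = duplicated_feature_rows(train, valid, feature_keys=["x"])
print(len(train), len(valid), dupes)
```

Metadata and ID leaks are harder to check generically; a common first step is to see whether a model trained only on the ID or on the row index predicts the target better than chance.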
Questions and Investigation
What are "ground truths"?
corteges - what is this word
/Coursera's Competitive Data Science Course
Reading Room
- a good overview of the data science cycle in a general sense: https://cloud.google.com/ml-engine/docs/tensorflow/data-prep
- Detecting tanks https://www.jefftk.com/p/detecting-tanks
- Kaggle competitions: https://www.kaggle.com/
- University of Toronto Machine Learning http://www.learning.cs.toronto.edu/theses.html
Past solutions
http://ndres.me/kaggle-past-solutions/
https://www.kaggle.com/wiki/PastSolutions
http://www.chioka.in/kaggle-competition-solutions/
https://github.com/ShuaiW/kaggle-classification/
https://towardsdatascience.com/how-to-use-dataset-in-tensorflow-c758ef9e4428
https://towardsdatascience.com/how-to-train-neural-network-faster-with-optimizers-d297730b3713
NIPS - Neural Information Processing Systems
Demos and Labs
https://codelabs.developers.google.com/codelabs/scd-babyweight2/index.html#0
https://github.com/GoogleCloudPlatform/training-data-analyst
- JAX Quick start
- use your GPU / TPU for ML:
- https://jax.readthedocs.io/en/latest/notebooks/quickstart.html
Image processing
- Christopheraburns / gluoncv-yolo-playing_cards
- https://github.com/Christopheraburns/gluoncv-yolo-playing_cards/blob/master/Yolov3.ipynb
Chapter
https://github.com/FlorianMuellerklein/Machine-Learning
Improving our neural network (96% MNIST) https://databoys.github.io/ImprovingNN/
https://iamtrask.github.io/2015/07/12/basic-python-network/
https://plot.ly/python/create-online-dashboard/
https://www.anaconda.com/download/
http://jupyter.org/install.html
linear regression in 6 lines of code
source: https://towardsdatascience.com/linear-regression-in-6-lines-of-python-5e1d0cd05b8d
pip install scikit-learn
import numpy as np
import matplotlib.pyplot as plt  # To visualize
import pandas as pd  # To read data
from sklearn.linear_model import LinearRegression

data = pd.read_csv('data.csv')  # load data set
X = data.iloc[:, 0].values.reshape(-1, 1)  # values converts it into a numpy array
Y = data.iloc[:, 1].values.reshape(-1, 1)  # -1 means that calculate the dimension of rows, but have 1 column
linear_regressor = LinearRegression()  # create object for the class
linear_regressor.fit(X, Y)  # perform linear regression
Y_pred = linear_regressor.predict(X)  # make predictions

plt.scatter(X, Y)
plt.plot(X, Y_pred, color='red')
plt.show()
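For a single feature, the ordinary least squares line that `LinearRegression` fits has a closed form, so it can help to see it written out without any libraries. The sketch below is not the scikit-learn implementation; the function name `fit_line` and the toy numbers are made up for illustration. The slope is the covariance of x and y divided by the variance of x, and the intercept makes the line pass through the point of means.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one feature: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line passes through (mean_x, mean_y)
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy data (hypothetical) lying exactly on y = 2x + 1
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

On real data the points will not sit exactly on a line, and the same formulas give the line that minimizes the sum of squared vertical errors.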
Conferences
- TMLS2020 - Toronto Machine Learning Summit 2020