Machine Learning
https://medium.com/towards-data-science/the-mostly-complete-chart-of-neural-networks-explained-3fb6f2367464
[[Category:DataScience]]
Revision as of 00:01, 25 October 2018
getting started
google://getting started with machine learning
https://www.kaggle.com/wiki/GettingStartedWithPythonForDataScience - in progress
https://www.quora.com/I-want-to-learn-machine-learning-Where-should-I-start
http://thunderboltlabs.com/blog/2013/11/09/getting-started-with-machine-learning/
http://machinelearningmastery.com/machine-learning-for-programmers/
https://www.kaggle.com/dfernig/reddit-comments-may-2015/the-biannual-reddit-sarcasm-hunt/code
course: at coursera https://www.coursera.org/learn/machine-learning/home/week/1
book: Understanding Machine Learning: From Theory to Algorithms
algorithms
- random forest
- https://medium.com/rants-on-machine-learning/the-unreasonable-effectiveness-of-random-forests-f33c3ce28883
- Nearest Neighbors Classification
- http://scikit-learn.org/stable/modules/neighbors.html
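As a minimal sketch of the two algorithms listed above, the following fits scikit-learn's RandomForestClassifier and KNeighborsClassifier on the bundled iris data set and reports test accuracy. The data set, the 75/25 split, and the hyperparameters (n_estimators=100, n_neighbors=5) are assumptions chosen for illustration; they are not taken from the linked articles.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier


def compare_classifiers(random_state=0):
    """Fit both classifiers on iris and return their test accuracies."""
    X, y = load_iris(return_X_y=True)
    # Hold out 25% of the rows; stratify keeps class proportions equal.
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=random_state, stratify=y
    )
    models = {
        # Hyperparameters are illustrative assumptions, not tuned values.
        "random_forest": RandomForestClassifier(
            n_estimators=100, random_state=random_state
        ),
        "nearest_neighbors": KNeighborsClassifier(n_neighbors=5),
    }
    scores = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        scores[name] = model.score(X_test, y_test)  # mean accuracy
    return scores


if __name__ == "__main__":
    for name, accuracy in compare_classifiers().items():
        print(f"{name}: test accuracy {accuracy:.3f}")
```

Both estimators share the same fit/score interface, so swapping in another scikit-learn classifier only means adding an entry to the models dict.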
tools
python + libs
- SystemML - a Universal Translator for Big Data and Machine Learning
image labeling
https://github.com/Labelbox/Labelbox
TensorFlow Playground
http://playground.tensorflow.org
sample data
blogs
Cool Projects
https://github.com/aficnar/slackpolice
- Aerospace Controls Lab
- http://acl.mit.edu/
- https://www.youtube.com/channel/UCVTxuaJsdMrk3UEcHVll9Yg
Data leaks
A data leak occurs when data associated with the data set gives away the target.
Primarily of concern in competitions.
Unexpected data.
reference: https://www.coursera.org/learn/competitive-data-science/lecture/5w9Gy/basic-data-leaks
Future peeking - using time series data from outside the target time period, for example from the future.
Metadata leaks - for example file metadata, zip file metadata, image file metadata.
Information hidden in IDs and hashes.
Information hidden in row order, and possibly in duplicate rows.
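A few of the leak types above can be checked with simple pandas helpers. This is only a sketch under stated assumptions: the function names, the column names (day, x, y), and the toy data are hypothetical and do not come from the Coursera lecture.

```python
import pandas as pd


def time_based_split(df, time_col, cutoff):
    """Avoid future peeking: train strictly before cutoff, validate from it on."""
    train = df[df[time_col] < cutoff]
    valid = df[df[time_col] >= cutoff]
    return train, valid


def rows_shared_with_test(train, test, feature_cols):
    """Return train rows whose feature values also appear in test (duplicates)."""
    test_keys = test[feature_cols].drop_duplicates()
    return train.merge(test_keys, on=feature_cols, how="inner")


def target_correlation_with_row_order(df, target_col):
    """Correlate row position with the target; far from 0 hints at an order leak."""
    positions = pd.Series(range(len(df)), index=df.index)
    return positions.corr(df[target_col])


if __name__ == "__main__":
    # Hypothetical toy data, for illustration only.
    df = pd.DataFrame(
        {
            "day": pd.to_datetime(
                ["2018-01-01", "2018-01-02", "2018-01-03", "2018-01-04"]
            ),
            "x": [1, 2, 2, 3],
            "y": [0, 0, 1, 1],
        }
    )
    train, valid = time_based_split(df, "day", pd.Timestamp("2018-01-03"))
    print(len(train), "train rows,", len(valid), "validation rows")
    print(len(rows_shared_with_test(train, valid, ["x"])), "shared feature rows")
    print("row-order correlation:", target_correlation_with_row_order(df, "y"))
```

None of these checks proves a leak on its own; they only flag places worth inspecting by hand.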
Questions and Investigation
What are "ground truths"?
corteges - what is this word
/Coursera's Competitive Data Science Course
Reading Room
- Detecting tanks https://www.jefftk.com/p/detecting-tanks
Kaggle competitions:
Past solutions
http://ndres.me/kaggle-past-solutions/
https://www.kaggle.com/wiki/PastSolutions
http://www.chioka.in/kaggle-competition-solutions/
https://github.com/ShuaiW/kaggle-classification/
Chapter
https://github.com/FlorianMuellerklein/Machine-Learning
Improving our neural network (96% MNIST) https://databoys.github.io/ImprovingNN/
https://iamtrask.github.io/2015/07/12/basic-python-network/
https://plot.ly/python/create-online-dashboard/
https://www.anaconda.com/download/
http://jupyter.org/install.html
linear regression in 6 lines of code
source: https://towardsdatascience.com/linear-regression-in-6-lines-of-python-5e1d0cd05b8d
pip install numpy pandas matplotlib scikit-learn
import numpy as np
import matplotlib.pyplot as plt  # to visualize
import pandas as pd  # to read data
from sklearn.linear_model import LinearRegression
data = pd.read_csv('data.csv')  # load data set
X = data.iloc[:, 0].values.reshape(-1, 1)  # .values converts the column into a numpy array
Y = data.iloc[:, 1].values.reshape(-1, 1)  # -1 lets numpy infer the number of rows; 1 fixes a single column
linear_regressor = LinearRegression()  # create object for the class
linear_regressor.fit(X, Y)  # perform linear regression
Y_pred = linear_regressor.predict(X)  # make predictions
plt.scatter(X, Y)  # plot the raw points
plt.plot(X, Y_pred, color='red')  # overlay the fitted line
plt.show()
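To try the regression without a real data.csv, one option is to generate a synthetic file first. The sketch below writes a two-column CSV from a known line and checks that LinearRegression recovers roughly the same slope and intercept. The true slope (3.0), intercept (2.0), noise level, column names, and helper names are all assumptions for illustration; none of them come from the source article.

```python
import os
import tempfile

import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression


def write_synthetic_csv(path, slope=3.0, intercept=2.0, n=200, seed=0):
    """Write n noisy points from y = slope * x + intercept to a CSV file."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(0, 10, size=n)
    y = slope * x + intercept + rng.normal(0, 0.5, size=n)
    pd.DataFrame({"x": x, "y": y}).to_csv(path, index=False)


def fit_from_csv(path):
    """Fit a line to the first two columns; return (slope, intercept, R^2)."""
    data = pd.read_csv(path)
    X = data.iloc[:, 0].values.reshape(-1, 1)
    Y = data.iloc[:, 1].values.reshape(-1, 1)
    model = LinearRegression().fit(X, Y)
    # Y is 2-D, so coef_ has shape (1, 1) and intercept_ has shape (1,).
    return float(model.coef_[0, 0]), float(model.intercept_[0]), model.score(X, Y)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp_dir:
        csv_path = os.path.join(tmp_dir, "data.csv")
        write_synthetic_csv(csv_path)
        slope, intercept, r2 = fit_from_csv(csv_path)
        print(f"slope={slope:.3f} intercept={intercept:.3f} R^2={r2:.3f}")
```

Because the true line is known, a fitted slope or intercept far from the generating values would point to a problem in the loading or reshaping steps rather than in the data.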