Machine Learning: Difference between revisions

From Federal Burro of Information


Revision as of 00:01, 25 October 2018

getting started

google://getting started with machine learning

https://www.kaggle.com/wiki/GettingStartedWithPythonForDataScience - in progress

https://www.quora.com/I-want-to-learn-machine-learning-Where-should-I-start

http://thunderboltlabs.com/blog/2013/11/09/getting-started-with-machine-learning/

http://machinelearningmastery.com/machine-learning-for-programmers/

https://www.kaggle.com/dfernig/reddit-comments-may-2015/the-biannual-reddit-sarcasm-hunt/code

course at Coursera: https://www.coursera.org/learn/machine-learning/home/week/1

Understanding Machine Learning: From Theory to Algorithms

algorithms

random forest: https://medium.com/rants-on-machine-learning/the-unreasonable-effectiveness-of-random-forests-f33c3ce28883

Nearest Neighbors Classification: http://scikit-learn.org/stable/modules/neighbors.html

lstm: http://blog.echen.me/2017/05/30/exploring-lstms/

tools

python + libs

Caffe deep learning framework

SystemML- a Universal Translator for Big Data and Machine Learning

image labeling

https://github.com/Labelbox/Labelbox

TensorFlow Playground

http://playground.tensorflow.org

sample data

http://archive.ics.uci.edu/ml/datasets/Smartphone-Based+Recognition+of+Human+Activities+and+Postural+Transitions

blogs

http://blog.datumbox.com/

Cool Projects

https://github.com/aficnar/slackpolice

Aerospace Controls Lab: http://acl.mit.edu/; https://www.youtube.com/channel/UCVTxuaJsdMrk3UEcHVll9Yg

Data leaks

A data leak occurs when data associated with the data set gives away the target.

Primarily a concern in competitions.

Unexpected data.

reference: https://www.coursera.org/learn/competitive-data-science/lecture/5w9Gy/basic-data-leaks

Future peeking - using time series data from outside the target time period, for example from the future.

Metadata leaks - for example file metadata, zip file metadata, image file metadata.

Information hidden in IDs and hashes, and information hidden in row order and possibly in duplicate rows.
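A few of the leak types above can be checked with very little code. The following is a minimal sketch, not a method from the course: the toy rows, the column layout, and the helper names (`time_split`, `duplicate_rows`, `order_target_correlation`) are all hypothetical. It shows a time-based split that avoids future peeking, a duplicate-row check, and a rough test of whether row order alone predicts the target.

```python
from datetime import date

# Hypothetical toy rows: (row_id, timestamp, feature, target)
rows = [
    (1, date(2018, 1, 5), 0.2, 0),
    (2, date(2018, 2, 9), 0.4, 0),
    (3, date(2018, 3, 1), 0.9, 1),
    (4, date(2018, 4, 7), 0.7, 1),
    (4, date(2018, 4, 7), 0.7, 1),  # deliberate duplicate row
    (5, date(2018, 5, 2), 0.8, 1),
]


def time_split(rows, cutoff):
    """Train only on rows strictly before the cutoff; validate on the rest."""
    train = [r for r in rows if r[1] < cutoff]
    valid = [r for r in rows if r[1] >= cutoff]
    return train, valid


def duplicate_rows(rows):
    """Return every row that repeats an earlier row exactly."""
    seen, dups = set(), []
    for r in rows:
        if r in seen:
            dups.append(r)
        seen.add(r)
    return dups


def order_target_correlation(rows):
    """Pearson correlation between row position and target.

    A value far from 0 suggests the row order leaks the target.
    Assumes the target is not constant (otherwise this divides by zero).
    """
    n = len(rows)
    xs = list(range(n))
    ys = [r[3] for r in rows]
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


train, valid = time_split(rows, date(2018, 4, 1))
print(len(train), len(valid))
print(duplicate_rows(rows))
print(round(order_target_correlation(rows), 3))
```

In this toy data the targets are sorted, so the order correlation comes out high; on a real data set, shuffling the rows and re-checking is one way to see whether the signal came from ordering alone.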

Questions and Investigation

What are "ground truths"?

corteges - what is this word

/Coursera's Competitive Data Science Course

Reading Room

What a Deep Neural Network thinks about your #selfie

Detecting tanks https://www.jefftk.com/p/detecting-tanks

https://analyticsdefined.com/mining-enron-emails/

https://www.coursera.org/learn/competitive-data-science/lecture/5w9Gy/basic-data-leaks

https://opendatascience.com/blog/

Kaggle competitions:

https://www.kaggle.com/

Past solutions

http://ndres.me/kaggle-past-solutions/
https://www.kaggle.com/wiki/PastSolutions
http://www.chioka.in/kaggle-competition-solutions/
https://github.com/ShuaiW/kaggle-classification/

Chapter

https://github.com/FlorianMuellerklein/Machine-Learning

Improving our neural network (96% MNIST) https://databoys.github.io/ImprovingNN/

https://iamtrask.github.io/2015/07/12/basic-python-network/

https://plot.ly/python/create-online-dashboard/

https://www.anaconda.com/download/

http://jupyter.org/install.html

https://medium.com/towards-data-science/the-mostly-complete-chart-of-neural-networks-explained-3fb6f2367464

linear regression in 6 lines of code

source: https://towardsdatascience.com/linear-regression-in-6-lines-of-python-5e1d0cd05b8d

pip install scikit-learn pandas matplotlib numpy

import numpy as np
import matplotlib.pyplot as plt  # To visualize
import pandas as pd  # To read data
from sklearn.linear_model import LinearRegression

data = pd.read_csv('data.csv')  # load data set
X = data.iloc[:, 0].values.reshape(-1, 1)  # first column as an (n, 1) NumPy array; scikit-learn expects 2-D features
Y = data.iloc[:, 1].values.reshape(-1, 1)  # second column; -1 lets NumPy infer the number of rows, with 1 column
linear_regressor = LinearRegression()  # create object for the class
linear_regressor.fit(X, Y)  # perform linear regression
Y_pred = linear_regressor.predict(X)  # make predictions

plt.scatter(X, Y)
plt.plot(X, Y_pred, color='red')
plt.show()
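With a single feature, the ordinary least squares line that LinearRegression fits has a closed form: slope = cov(x, y) / var(x) and intercept = mean(y) - slope * mean(x). The sketch below computes that by hand in plain Python, with no data.csv needed. The helper name `fit_line` and the toy data are hypothetical, chosen so the points lie exactly on y = 2x + 1.

```python
def fit_line(xs, ys):
    """Closed-form ordinary least squares for one feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variance without the 1/n factor; it cancels in the ratio.
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy data lying exactly on y = 2x + 1
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

On the scikit-learn side, the fitted values are exposed as `linear_regressor.coef_` and `linear_regressor.intercept_`, which can be compared against a hand calculation like this one.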
