Machine Learning: Difference between revisions

From Federal Burro of Information


* https://analyticsdefined.com/mining-enron-emails/



Revision as of 03:16, 9 May 2018

getting started

google://getting started with machine learning

https://www.kaggle.com/wiki/GettingStartedWithPythonForDataScience - in progress

https://www.quora.com/I-want-to-learn-machine-learning-Where-should-I-start

http://thunderboltlabs.com/blog/2013/11/09/getting-started-with-machine-learning/

http://machinelearningmastery.com/machine-learning-for-programmers/

https://www.kaggle.com/dfernig/reddit-comments-may-2015/the-biannual-reddit-sarcasm-hunt/code

course at Coursera: https://www.coursera.org/learn/machine-learning/home/week/1

understanding machine learning theory algorithms

algorithms

random forest
https://medium.com/rants-on-machine-learning/the-unreasonable-effectiveness-of-random-forests-f33c3ce28883
Nearest Neighbors Classification
http://scikit-learn.org/stable/modules/neighbors.html
lstm
http://blog.echen.me/2017/05/30/exploring-lstms/

tools

python + libs

sample data

http://archive.ics.uci.edu/ml/datasets/Smartphone-Based+Recognition+of+Human+Activities+and+Postural+Transitions

blogs

http://blog.datumbox.com/

Cool Projects

https://github.com/aficnar/slackpolice


Aerospace Controls Lab
http://acl.mit.edu/
https://www.youtube.com/channel/UCVTxuaJsdMrk3UEcHVll9Yg

Data leaks

A data leak occurs when information associated with the data set gives away the target values.

Primarily a concern in competitions.

Unexpected data.

reference: https://www.coursera.org/learn/competitive-data-science/lecture/5w9Gy/basic-data-leaks

Future peeking - using time series data from outside the target time period, for example training on data from the future relative to the period being predicted.
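
One common way to guard against future peeking is to split by time rather than at random, so that every training row is earlier than every test row. The following is a minimal sketch using only the standard library; the row layout, the `day` and `sales` field names, and the cutoff date are made up for illustration and are not from the course.

```python
from datetime import date

def time_based_split(rows, cutoff):
    """Split rows into (train, test): train holds rows dated before cutoff,
    test holds rows dated on or after cutoff."""
    train = [r for r in rows if r["day"] < cutoff]
    test = [r for r in rows if r["day"] >= cutoff]
    return train, test

# Hypothetical daily records, for illustration only.
rows = [
    {"day": date(2018, 1, 5), "sales": 10},
    {"day": date(2018, 2, 5), "sales": 12},
    {"day": date(2018, 3, 5), "sales": 9},
    {"day": date(2018, 4, 5), "sales": 14},
]

train, test = time_based_split(rows, cutoff=date(2018, 3, 1))

# Sanity check: the latest training row is earlier than the earliest test row.
no_future_peeking = max(r["day"] for r in train) < min(r["day"] for r in test)
```

A random shuffle of the same rows could put the April record in the training set and the February record in the test set, which is exactly the situation the split above avoids.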

Metadata leaks - for example file metadata, zip file metadata, image file metadata.
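
As a rough illustration of the zip file case, the sketch below reads the timestamp stored for each archive member with Python's standard `zipfile` module. The archive is built in memory, and the file names and dates are invented for the example; whether timestamps actually correlate with the target has to be checked per data set.

```python
import io
import zipfile

def zip_timestamps(zip_bytes):
    """Map each archive member name to its stored timestamp tuple
    (year, month, day, hour, minute, second)."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return {info.filename: info.date_time for info in archive.infolist()}

# Build a tiny archive in memory with hypothetical file names and dates.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr(
        zipfile.ZipInfo("cat_001.jpg", date_time=(2017, 1, 1, 12, 0, 0)), b""
    )
    archive.writestr(
        zipfile.ZipInfo("dog_001.jpg", date_time=(2018, 6, 1, 12, 0, 0)), b""
    )

stamps = zip_timestamps(buffer.getvalue())
```

If, say, all files of one class were archived in a different year than the other class, the timestamp alone would separate them, which is the kind of leak described above.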

Information hidden in IDs and hashes, and information hidden in row order and possibly duplicate rows.
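
Two quick checks for the row-level leaks above can be sketched as follows: counting duplicated feature rows, and measuring the longest run of identical consecutive target values, where a long run hints that rows are sorted or grouped by target. Both helper names and the sample data are hypothetical, and these are heuristics rather than a complete leak audit.

```python
from collections import Counter

def duplicate_rows(rows):
    """Return a dict of feature tuples that occur more than once, with counts."""
    counts = Counter(tuple(r) for r in rows)
    return {row: n for row, n in counts.items() if n > 1}

def longest_target_run(targets):
    """Length of the longest run of identical consecutive target values."""
    longest = current = 0
    previous = object()  # sentinel that never equals a real target value
    for t in targets:
        current = current + 1 if t == previous else 1
        previous = t
        longest = max(longest, current)
    return longest

# Hypothetical feature rows and targets, in file order.
features = [(1, 2), (3, 4), (1, 2), (5, 6)]
targets = [0, 0, 0, 1]

dupes = duplicate_rows(features)
run = longest_target_run(targets)
```

Here `dupes` flags the repeated `(1, 2)` row and `run` is 3 out of 4 rows, so both checks would prompt a closer look before trusting a validation score.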

Reading Room

Chapter

https://github.com/FlorianMuellerklein/Machine-Learning

Improving our neural network (96% MNIST) https://databoys.github.io/ImprovingNN/

https://iamtrask.github.io/2015/07/12/basic-python-network/

https://plot.ly/python/create-online-dashboard/

https://www.anaconda.com/download/

http://jupyter.org/install.html

https://medium.com/towards-data-science/the-mostly-complete-chart-of-neural-networks-explained-3fb6f2367464