Big Data
* https://github.com/tdhopper/Data-Science-Conference-Bingo
Revision as of 00:47, 8 May 2018
Overview
- node management
- key value stores
- storage management
- job management
Key aspects:
- Integration
- Analysis
- Visualization
- Workload Optimization
- Security
- Governance
Key-Value Stores
Lists and projects:
http://www.metabrew.com/article/anti-rdbms-a-list-of-distributed-key-value-stores
http://www.project-voldemort.com/voldemort/
https://en.wikipedia.org/wiki/Redis
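As a rough illustration of what a distributed key-value store does, here is a minimal sketch: an in-memory store with put/get/delete, plus a function that picks a node for each key by hashing. The names `TinyKVStore`, `ShardedKVStore` and `shard_for` are made up for this note and are not the API of Voldemort or Redis. The sharding is plain hash-modulo, which is simpler than the consistent hashing that real systems tend to use.

```python
import hashlib
from typing import Dict, List, Optional


class TinyKVStore:
    """Toy in-memory key-value store (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def put(self, key: str, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: str) -> Optional[bytes]:
        return self._data.get(key)

    def delete(self, key: str) -> bool:
        # True if the key existed and was removed.
        return self._data.pop(key, None) is not None


def shard_for(key: str, num_nodes: int) -> int:
    """Map a key to a node index with a stable hash (simple modulo sharding)."""
    digest = hashlib.sha1(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_nodes


class ShardedKVStore:
    """Routes each key to one of several TinyKVStore 'nodes'."""

    def __init__(self, num_nodes: int) -> None:
        self.nodes: List[TinyKVStore] = [TinyKVStore() for _ in range(num_nodes)]

    def _node(self, key: str) -> TinyKVStore:
        return self.nodes[shard_for(key, len(self.nodes))]

    def put(self, key: str, value: bytes) -> None:
        self._node(key).put(key, value)

    def get(self, key: str) -> Optional[bytes]:
        return self._node(key).get(key)
```

With modulo sharding, changing the number of nodes remaps most keys, which is one reason production stores use other partitioning schemes.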
Storage
Oracle Cluster File System (OCFS)
Old?
- https://oss.oracle.com/projects/ocfs2/
- https://oss.oracle.com/projects/ocfs/dist/documentation/RHAS_best_practices.html
GFS
Hadoop
- key-value lookups with HBase (NoSQL)
- SQL queries with Hive
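To make the contrast between the two access styles concrete, here is a small local sketch. It uses a Python dict as a stand-in for row-key lookups and the standard-library `sqlite3` module as a stand-in for SQL queries. Neither is the HBase or Hive API; the table name `requests` and the sample rows are invented for illustration.

```python
import sqlite3

# Invented sample rows: (row_key, method, status)
rows = [
    ("row1", "GET", 200),
    ("row2", "POST", 500),
    ("row3", "GET", 404),
]

# Key-value style: fetch one record directly by its row key.
kv = {key: {"method": method, "status": status} for key, method, status in rows}


def get_row(key):
    """Return the record stored under `key`, or None if absent."""
    return kv.get(key)


# SQL style: declare a table, then ask aggregate questions over all rows.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE requests (row_key TEXT PRIMARY KEY, method TEXT, status INTEGER)"
)
conn.executemany("INSERT INTO requests VALUES (?, ?, ?)", rows)


def count_by_method():
    """Return (method, count) pairs, sorted by method name."""
    cur = conn.execute(
        "SELECT method, COUNT(*) FROM requests GROUP BY method ORDER BY method"
    )
    return cur.fetchall()
```

The key lookup answers "what is stored under this key?", while the SQL query answers a question across the whole table.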
Examples
Log data
Hadoop Analysis of Apache Logs Using Flume-NG, Hive and Pig
http://cuddletech.com/blog/?p=795
http://www.elasticsearch.org/ - Elasticsearch is another option for log analysis
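Before reaching for a cluster, the kind of question asked of Apache logs can be tried locally. The sketch below parses lines in the Apache Common Log Format and counts requests per HTTP status code. It is not the Flume/Hive/Pig pipeline from the linked article; the function name `count_status` and the regular expression are written for this note and assume the default log layout.

```python
import re
from collections import Counter
from typing import Iterable, Tuple

# host ident user [timestamp] "request" status size
# Combined-format logs carry extra trailing fields, so only the prefix is matched.
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+)'
)


def count_status(lines: Iterable[str]) -> Tuple[Counter, int]:
    """Count requests per status code; also return how many lines did not parse."""
    counts: Counter = Counter()
    skipped = 0
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            skipped += 1
            continue
        counts[int(match.group("status"))] += 1
    return counts, skipped
```

For a real file, something like `count_status(open("access.log"))` would work, where `access.log` is a placeholder path.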
JPL GOES Sea Surface Temperature data
"Geostationary Operational Environmental Satellites (GOES) 6km Near Real-Time Sea Surface Temperature (SST) Documentation"
ftp://podaac-ftp.jpl.nasa.gov/allData/goes/L3/goes_6km_nrt/docs/goes_sst_doc.html
http://podaac-w10n.jpl.nasa.gov/w10n/allData/goes/L3/goes_6km_nrt/americas/2016/
Open question: what is the format of this data?
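One way to approach the format question without guessing is to download a single file and look at its first bytes. The sketch below checks a header against a few well-known file signatures (HDF5, NetCDF classic, HDF4, gzip). It makes no claim about what the GOES SST files actually contain; the list of signatures is a small selection and `sniff_format` / `sniff_file` are names invented here.

```python
from typing import List, Tuple

# (leading bytes, description). Only a handful of common scientific formats.
MAGIC_SIGNATURES: List[Tuple[bytes, str]] = [
    (b"\x89HDF\r\n\x1a\n", "HDF5 (also the container for NetCDF-4)"),
    (b"CDF\x01", "NetCDF classic"),
    (b"CDF\x02", "NetCDF 64-bit offset"),
    (b"\x0e\x03\x13\x01", "HDF4"),
    (b"\x1f\x8b", "gzip-compressed (decompress, then check again)"),
]


def sniff_format(header: bytes) -> str:
    """Guess a file format from its first bytes; 'unknown' if nothing matches."""
    for magic, name in MAGIC_SIGNATURES:
        if header.startswith(magic):
            return name
    return "unknown"


def sniff_file(path: str) -> str:
    """Read the first 8 bytes of a local file and guess its format."""
    # Note: an HDF5 file with a user block has its signature at a later
    # offset (512, 1024, ...), which this simple check would miss.
    with open(path, "rb") as fh:
        return sniff_format(fh.read(8))
```

A result of "unknown" may simply mean raw binary or text, in which case the documentation linked above is the place to look.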
Learning Progress and Recognition
https://courses.cognitiveclass.ai/certificates/493c0df647484b2082c76328e46feaa5
Deep learning: https://campus.datacamp.com/courses/deep-learning-in-python/basics-of-deep-learning-and-neural-networks?ex=1
Open data - Sources
https://www.kaggle.com/datasets
Reference
- "Big data dudes"