Big Data
From Federal Burro of Information
- map of tools: http://insightdataengineering.com/blog/pipeline_map.html
Revision as of 00:31, 28 June 2017
Overview
- node management
- key value stores
- storage management
- job management
Key aspects:
- Integration
- Analysis
- Visualization
- Workload Optimization
- Security
- Governance
Key Value Stores
list:
http://www.metabrew.com/article/anti-rdbms-a-list-of-distributed-key-value-stores
http://www.project-voldemort.com/voldemort/
https://en.wikipedia.org/wiki/Redis
Storage
Oracle Cluster File System (OCFS)
Old?
- https://oss.oracle.com/projects/ocfs2/
- https://oss.oracle.com/projects/ocfs/dist/documentation/RHAS_best_practices.html
GFS
Hadoop
- key-value lookups with HBase (NoSQL)
- SQL queries with Hive
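The two access patterns above can be contrasted with a small sketch. This is not HBase or Hive code: a Python dict stands in for the key-value store and an in-memory SQLite database stands in for the SQL layer, and the sample records and the table name `requests` are made up for illustration.

```python
import sqlite3

# Hypothetical sample records: (row key, HTTP status, bytes sent).
ROWS = [
    ("req-001", 200, 512),
    ("req-002", 404, 128),
    ("req-003", 200, 2048),
]

# Key-value access (the HBase-style pattern): fetch one record by its row key.
# A plain dict is only a stand-in for a distributed key-value store.
kv_store = {key: {"status": status, "bytes": size} for key, status, size in ROWS}
record = kv_store["req-002"]

# SQL access (the Hive-style pattern): aggregate across all records.
# SQLite is only a stand-in for a SQL-on-Hadoop engine.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE requests (row_key TEXT PRIMARY KEY, status INTEGER, bytes INTEGER)"
)
conn.executemany("INSERT INTO requests VALUES (?, ?, ?)", ROWS)
totals = conn.execute(
    "SELECT status, COUNT(*), SUM(bytes) FROM requests "
    "GROUP BY status ORDER BY status"
).fetchall()
conn.close()

print(record)
print(totals)
```

The key-value path answers "give me this one row" cheaply; the SQL path answers "summarize all rows", which is the kind of question Hive is usually brought in for.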
Examples
Log data
Hadoop Analysis of Apache Logs Using Flume-NG, Hive and Pig
http://cuddletech.com/blog/?p=795
http://www.elasticsearch.org/ - also Elasticsearch
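As a rough idea of what a log-analysis job computes, here is a minimal single-machine sketch in Python. It is not the Flume/Hive/Pig pipeline from the linked article; it assumes the logs are in Apache Common Log Format, and the sample lines and function names (`parse_line`, `count_by_status`) are made up for illustration.

```python
import re
from collections import Counter

# Apache Common Log Format: host ident user [time] "request" status bytes
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\d+|-)'
)

def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    # Apache writes "-" when no bytes were sent.
    fields["bytes"] = 0 if fields["bytes"] == "-" else int(fields["bytes"])
    return fields

def count_by_status(lines):
    """Count requests per HTTP status code, skipping unparseable lines."""
    counts = Counter()
    for line in lines:
        record = parse_line(line)
        if record is not None:
            counts[record["status"]] += 1
    return counts

# Hypothetical sample input.
SAMPLE = [
    '127.0.0.1 - - [28/Jun/2017:00:31:00 +0000] "GET /index.html HTTP/1.1" 200 1043',
    '127.0.0.1 - - [28/Jun/2017:00:31:05 +0000] "GET /missing HTTP/1.1" 404 -',
    '10.0.0.2 - frank [28/Jun/2017:00:32:10 +0000] "POST /api HTTP/1.1" 200 87',
    'not a log line',
]

status_counts = count_by_status(SAMPLE)
print(dict(status_counts))
```

The parse step corresponds loosely to a map phase and the per-status count to a reduce phase, which is why this kind of job moves easily to Hive or Pig once the logs outgrow one machine.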
JPL GOES sea surface temperature data
"Geostationary Operational Environmental Satellites (GOES) 6km Near Real-Time Sea Surface Temperature (SST) Documentation"
ftp://podaac-ftp.jpl.nasa.gov/allData/goes/L3/goes_6km_nrt/docs/goes_sst_doc.html
http://podaac-w10n.jpl.nasa.gov/w10n/allData/goes/L3/goes_6km_nrt/americas/2016/
what is the format of this data?
Reference
- "Big data dudes"