Big Data

From Federal Burro of Information

* map of tools http://insightdataengineering.com/blog/pipeline_map.html

Revision as of 00:31, 28 June 2017

Overview

  1. node management
  2. key value stores
  3. storage management
  4. job management

Key aspects:

  • Integration
  • Analysis
  • Visualization
  • Work Load Optimization
  • Security
  • Governance


Key Value Stores

list:
http://www.metabrew.com/article/anti-rdbms-a-list-of-distributed-key-value-stores

http://www.project-voldemort.com/voldemort/

https://en.wikipedia.org/wiki/Redis


Storage

Oracle Cluster File System (OCFS)

Old?

GFS

Hadoop

  • key-value lookups with HBase (NoSQL)
  • SQL queries with Hive
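
The contrast between the two access styles can be sketched without a Hadoop cluster. The example below is only an analogy: it assumes a plain Python dict as a stand-in for a row-key lookup and the standard-library sqlite3 module as a stand-in for a SQL query layer. The table name, column names, and sample rows are hypothetical, and none of this is HBase or Hive API.

```python
import sqlite3

# Hypothetical sample records: (row key, host, bytes sent).
ROWS = [
    ("req-001", "10.0.0.1", 512),
    ("req-002", "10.0.0.2", 2048),
    ("req-003", "10.0.0.1", 1024),
]


def build_kv_store(rows):
    """Key-value style: each record is fetched directly by its row key."""
    return {key: {"host": host, "bytes": size} for key, host, size in rows}


def bytes_per_host(rows):
    """SQL style: a declarative query aggregates across all records."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute(
            "CREATE TABLE requests (id TEXT PRIMARY KEY, host TEXT, bytes INTEGER)"
        )
        conn.executemany("INSERT INTO requests VALUES (?, ?, ?)", rows)
        cursor = conn.execute(
            "SELECT host, SUM(bytes) FROM requests GROUP BY host ORDER BY host"
        )
        return cursor.fetchall()
    finally:
        conn.close()


if __name__ == "__main__":
    store = build_kv_store(ROWS)
    print(store["req-002"])      # single lookup by key
    print(bytes_per_host(ROWS))  # aggregate over the whole table
```

The key-value path answers "give me this one record" cheaply, while the SQL path answers questions that span many records, which is roughly the split the two bullets describe.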

Examples

Log data

Hadoop Analysis of Apache Logs Using Flume-NG, Hive and Pig
http://cuddletech.com/blog/?p=795

http://www.elasticsearch.org/ - see also Elasticsearch
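
As a small-scale illustration of what log analysis involves, here is a minimal sketch in plain Python. It is not the Flume-NG, Hive and Pig pipeline from the linked post. It assumes the logs are in Apache Common Log Format; the function names and the sample lines are hypothetical.

```python
import re
from collections import Counter

# Apache Common Log Format: host ident user [time] "request" status size
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\d+|-)'
)

# Hypothetical sample input, including one line that is not a log entry.
SAMPLE = [
    '127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326',
    '127.0.0.1 - - [10/Oct/2000:13:56:01 -0700] "GET /missing HTTP/1.0" 404 -',
    'not a log line',
]


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["status"] = int(record["status"])
    # A size of "-" means no body was sent; treat it as zero bytes.
    record["size"] = 0 if record["size"] == "-" else int(record["size"])
    return record


def count_statuses(lines):
    """Count how many requests ended with each HTTP status code."""
    counts = Counter()
    for line in lines:
        record = parse_line(line)
        if record is not None:
            counts[record["status"]] += 1
    return counts


if __name__ == "__main__":
    for status, total in sorted(count_statuses(SAMPLE).items()):
        print(status, total)
```

The same parse-then-aggregate shape is what the cluster tools distribute across many machines when the logs no longer fit on one.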

JPL GOES sea surface temperature data

"Geostationary Operational Environmental Satellites (GOES) 6km Near Real-Time Sea Surface Temperature (SST) Documentation"

ftp://podaac-ftp.jpl.nasa.gov/allData/goes/L3/goes_6km_nrt/docs/goes_sst_doc.html

http://podaac-w10n.jpl.nasa.gov/w10n/allData/goes/L3/goes_6km_nrt/americas/2016/

Open question: what is the format of this data?

Reference

  • "Big data dudes"