Big Data

From Federal Burro of Information

Overview

  1. node management
  2. key value stores
  3. storage management
  4. job management

Key aspects:

  • Integration
  • Analysis
  • Visualization
  • Work Load Optimization
  • Security
  • Governance


Key-Value Stores

List of distributed key-value stores:
http://www.metabrew.com/article/anti-rdbms-a-list-of-distributed-key-value-stores

http://www.project-voldemort.com/voldemort/

https://en.wikipedia.org/wiki/Redis


Storage

Oracle Cluster File System (OCFS)

Old?

GFS

Hadoop

  • Key-value lookups with HBase (NoSQL)
  • SQL queries with Hive
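
The two access styles above can be contrasted with a small sketch. This is only an illustration in plain Python: a dict stands in for a key-value store and an in-memory SQLite table stands in for a SQL layer. It does not use the HBase or Hive APIs, and the sample data and function names (rows, kv_get, sql_total_visits) are hypothetical.

```python
import sqlite3

# Hypothetical sample data: row key -> record.
rows = {
    "user:1": {"name": "ada", "visits": 3},
    "user:2": {"name": "bob", "visits": 7},
}


def kv_get(store, key):
    """Key-value style access: fetch one record by its exact key."""
    return store.get(key)


def sql_total_visits(store):
    """SQL style access: load the records into a table and aggregate."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE users (id TEXT PRIMARY KEY, name TEXT, visits INTEGER)"
    )
    conn.executemany(
        "INSERT INTO users VALUES (?, ?, ?)",
        [(k, v["name"], v["visits"]) for k, v in store.items()],
    )
    (total,) = conn.execute("SELECT SUM(visits) FROM users").fetchone()
    conn.close()
    return total


if __name__ == "__main__":
    print(kv_get(rows, "user:2"))   # single-record lookup by key
    print(sql_total_visits(rows))   # aggregate over all records
```

The key-value call answers "give me this one row" cheaply, while the SQL call answers questions that span many rows; that is roughly the division of labour the bullets describe.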

Examples

Log data

Hadoop Analysis of Apache Logs Using Flume-NG, Hive and Pig
http://cuddletech.com/blog/?p=795
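
As a rough idea of what such a log analysis computes, here is a minimal single-machine sketch in plain Python. It is not the Flume/Hive/Pig pipeline from the linked article; it assumes the logs are in Apache Common Log Format, and the sample lines and function names (parse_line, count_by_status) are made up for illustration.

```python
import re
from collections import Counter

# Common Log Format: host ident user [time] "request" status size
LOG_PATTERN = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+)'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


def count_by_status(lines):
    """Count requests per HTTP status code, skipping unparseable lines."""
    counts = Counter()
    for line in lines:
        record = parse_line(line)
        if record is not None:
            counts[record["status"]] += 1
    return counts


# Hypothetical sample input, including one malformed line.
SAMPLE = [
    '127.0.0.1 - - [10/Oct/2016:13:55:36 -0700] "GET /index.html HTTP/1.1" 200 2326',
    '127.0.0.1 - - [10/Oct/2016:13:55:40 -0700] "GET /missing HTTP/1.1" 404 209',
    '10.0.0.5 - - [10/Oct/2016:13:56:01 -0700] "GET /index.html HTTP/1.1" 200 2326',
    'not a log line',
]

if __name__ == "__main__":
    for status, n in sorted(count_by_status(SAMPLE).items()):
        print(status, n)
```

In a Hadoop setup the same parse-then-group-and-count step would be expressed as a Hive query or Pig script over files collected by Flume.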

Also: Elasticsearch, http://www.elasticsearch.org/

JPL GOES sea surface temperature data

"Geostationary Operational Environmental Satellites (GOES) 6km Near Real-Time Sea Surface Temperature (SST) Documentation"

ftp://podaac-ftp.jpl.nasa.gov/allData/goes/L3/goes_6km_nrt/docs/goes_sst_doc.html

http://podaac-w10n.jpl.nasa.gov/w10n/allData/goes/L3/goes_6km_nrt/americas/2016/

Open question: what is the format of this data?

Learning Progress and Recognition

https://courses.cognitiveclass.ai/certificates/493c0df647484b2082c76328e46feaa5

https://courses.cognitiveclass.ai/courses/course-v1:BigDataUniversity+ML0101EN+2016_T3/courseware/407a9f86565c44189740699636b4fb85/d82ba5edac4f40efa334fff96b944b34/

Deep learning: https://campus.datacamp.com/courses/deep-learning-in-python/basics-of-deep-learning-and-neural-networks?ex=1

Open data - Sources

http://konect.uni-koblenz.de/

https://www.kaggle.com/datasets

Reference

  • "Big data dudes"

Also See

Text Classification with TensorFlow Estimators
https://opendatascience.com/text-classification-with-tensorflow-estimators/