GTALUG Intelligent Availability: The Evolution of High Availability 2017-06-13


GTALUG Intelligent Availability: the evolution of high availability

@alteeve

"Intelligent Availability"

Red Hat cluster stack?

What does HA mean?

reacts; software + hardware; can recover; has redundancy; hyper-converged

storage, compute, and network integrated

HC (hyper-converged) is about utilization, not service stability

Tools:

  • cman and rgmanager
  • pacemaker and cman

knowing the risk -> likelihood / cost
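
One way to read "likelihood / cost" is as an expected-loss comparison: multiply how likely a failure is by what it costs, then weigh that against the cost of guarding against it. The sketch below is only an illustration of that arithmetic; the `Risk` class, the risk names, and every figure in it are hypothetical and not from the talk.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    annual_likelihood: float  # assumed chance of the failure in a given year
    outage_cost: float        # assumed cost of one occurrence
    mitigation_cost: float    # assumed yearly cost of the redundancy that covers it

    def expected_annual_loss(self) -> float:
        # likelihood times cost: the loss you carry per year if you do nothing
        return self.annual_likelihood * self.outage_cost

    def worth_mitigating(self) -> bool:
        # simplification: treats the mitigation as removing the risk entirely
        return self.mitigation_cost < self.expected_annual_loss()


# Hypothetical numbers, for illustration only.
risks = [
    Risk("single PSU failure", 0.10, 50_000.0, 1_000.0),
    Risk("whole-site loss", 0.001, 500_000.0, 100_000.0),
]

for r in risks:
    verdict = "mitigate" if r.worth_mitigating() else "accept"
    print(f"{r.name}: expected loss {r.expected_annual_loss():.0f}/yr -> {verdict}")
```

With these made-up figures the cheap, likely failure is worth covering and the rare, expensive one is not, which is the cost-effectiveness argument in miniature.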

"It's not about service stability."

intelligent availability gold standard:

  • proactive, not reactive
  • complete stack redundancy

surviving a failure is not enough; recover as well.
replace a drive before it fails.
 is this an accounting exercise?
"the ideal would be stack redundancy all the way down"
"no downtime in replacement"
"we have to accept that when we design things, they will fail."
 response - yeah, but you don't design against all failures; it's not cost effective.

"world of goo"

DRBD

Distributed Replicated Block Device

you don't know how it will fail.

what steps need to happen to restore services?

  • maintenance window?
  • restart services?

if it doesn't have to be dependent, then don't make it dependent

labeling cables

usability, supportability

even if you are ISO 9001 ... ISO 9001 is not about not making mistakes.

being proactive:

no surprises
monitoring feeds you info, you still have to act on it.
"automated decision maker" - "ScanCore" is the proprietary decision-making engine
restart on fault - systemd
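
To make "monitoring feeds you info, you still have to act on it" concrete, here is a minimal sketch of a rule that turns sensor readings into an action before anything has actually failed. It is not ScanCore's logic or API; the `Action` enum, the `decide` function, and the temperature thresholds are all assumptions made for illustration.

```python
from enum import Enum


class Action(Enum):
    NONE = "none"
    WARN = "warn"
    MIGRATE = "migrate"  # move services off the node while it still works


def decide(readings, warn_at, critical_at):
    """Pick an action from recent temperature readings (oldest first).

    Hypothetical rule set: thresholds are supplied by the caller.
    """
    if not readings:
        # Missing data is itself a surprise, so flag it.
        return Action.WARN
    latest = readings[-1]
    rising = len(readings) >= 2 and readings[-1] > readings[0]
    if latest >= critical_at:
        return Action.MIGRATE
    if latest >= warn_at and rising:
        # Proactive step: above the warning line and still climbing,
        # so act now instead of waiting for the critical threshold.
        return Action.MIGRATE
    if latest >= warn_at:
        return Action.WARN
    return Action.NONE


print(decide([58, 62, 66], warn_at=60, critical_at=75))
```

The reactive counterpart is restart-on-fault, which systemd provides through the `Restart=on-failure` setting in a service unit; that only helps after the failure, whereas a rule like the one above acts on the trend.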

don't think about failure states

dependency graphing
known and unknown things
assumes a priori information about dependencies
collecting data and then using that to learn.
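
A small sketch of dependency graphing, assuming the dependencies are known up front (the a priori case noted above). The service names and the `depends_on` map are hypothetical; `graphlib` is in the Python standard library from 3.9 onward. It answers two questions from the notes: what else is hit when one thing fails, and in what order services can be restored.

```python
from graphlib import TopologicalSorter

# Hypothetical map: service -> the services it depends on.
depends_on = {
    "storage": set(),
    "network": set(),
    "database": {"storage", "network"},
    "web": {"database", "network"},
}


def restore_order(graph):
    """Return an order in which every service comes after its dependencies."""
    return list(TopologicalSorter(graph).static_order())


def affected_by(graph, failed):
    """Return every service that directly or transitively depends on `failed`."""
    hit = set()
    changed = True
    while changed:
        changed = False
        for svc, deps in graph.items():
            if svc not in hit and (failed in deps or deps & hit):
                hit.add(svc)
                changed = True
    return hit


print("restore order:", restore_order(depends_on))
print("affected by storage failure:", sorted(affected_by(depends_on, "storage")))
```

The map here is written by hand; the "collecting data and then using that to learn" idea would replace it with dependencies inferred from observation, which this sketch does not attempt.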