Calculating Averages with Data in Rrds

From Federal Burro of Information
Revision as of 12:24, 2 February 2009 by David

So you have data for many devices and you plot them all in one graph.

But one device has no data for part of the time period you are examining, and it blanks out the rest of the devices in that time period.

This can happen when you aggregate router traffic and then add a new router. Graphs covering dates that predate the new router will have no data.

This can also happen if your collection isn't perfect (whose is?). You collect data from many targets and want the average. Say DSL lines: you have 600 DSL lines, but not every client is up all the time. If ANY client is down and the RRD records NaN (Not a Number), then the average for all the remaining ports drops to NaN.

Any normal math on values that include NaN drops to NaN.
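To see the problem outside of RRDtool, here is a minimal Python sketch. The readings are made up, and float("nan") only stands in for the unknown value an RRD would record.

```python
import math

# Hypothetical readings from three devices; the second device was down,
# so its sample is NaN (standing in for RRDtool's unknown value).
readings = [120.0, float("nan"), 80.0]

# A plain average: one NaN in the sum makes the whole result NaN.
naive_average = sum(readings) / len(readings)

print(math.isnan(naive_average))  # prints True
```

One missing sample is enough to wipe out the average of the two good ones.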

I think I've cracked this problem. It's a pretty generic problem and so is the solution.

For each data point, calculate two CDEFs:

1. If the value is NaN, set it to zero, else use it:

datapoint=rawdatapoint,UN,0,rawdatapoint,IF

2. If the value is NaN, set the flag to zero, else set it to ONE:

datapointvalidflag=rawdatapoint,UN,0,1,IF
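The two CDEFs above can be mimicked in Python to check what they produce. This is only an emulation under the assumption that NaN plays the role of RRDtool's unknown value; the function names and sample values are invented for illustration.

```python
import math

def zero_if_unknown(raw):
    """Emulates raw,UN,0,raw,IF: NaN becomes 0, anything else passes through."""
    return 0.0 if math.isnan(raw) else raw

def valid_flag(raw):
    """Emulates raw,UN,0,1,IF: 0 for NaN, 1 for a real value."""
    return 0 if math.isnan(raw) else 1

# Hypothetical samples; the middle one is missing.
samples = [120.0, float("nan"), 80.0]

print([zero_if_unknown(s) for s in samples])  # [120.0, 0.0, 80.0]
print([valid_flag(s) for s in samples])       # [1, 0, 1]
```

The first list is safe to add up, and the second list counts how many values actually contributed.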

For the average, first calculate the divisor:

divisor=datapointvalidflag1,datapointvalidflag2,datapointvalidflag3,+,+

Then calculate the average:

average=datapoint1,datapoint2,datapoint3,+,+,divisor,/
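With 600 DSL lines nobody wants to type these by hand, so here is a rough Python sketch that writes out the CDEF arguments for any number of sources. It assumes each raw name is already defined by a DEF on the same rrdtool graph command; the helper name and the _val and _ok suffixes are made up for this example.

```python
def build_average_cdefs(raw_names):
    """Return CDEF arguments that average the given sources while skipping NaN.

    raw_names: vnames assumed to be defined elsewhere by DEF lines.
    """
    if len(raw_names) < 2:
        raise ValueError("need at least two sources to average")

    cdefs = []
    for name in raw_names:
        # NaN becomes zero so it cannot poison the sum.
        cdefs.append(f"CDEF:{name}_val={name},UN,0,{name},IF")
        # 1 if the source had a real value, 0 if it was NaN.
        cdefs.append(f"CDEF:{name}_ok={name},UN,0,1,IF")

    # Adding n values in RPN takes n-1 plus operators.
    plus = ",+" * (len(raw_names) - 1)
    vals = ",".join(f"{name}_val" for name in raw_names)
    flags = ",".join(f"{name}_ok" for name in raw_names)

    cdefs.append(f"CDEF:divisor={flags}{plus}")
    cdefs.append(f"CDEF:average={vals}{plus},divisor,/")
    return cdefs

# Hypothetical source names.
for line in build_average_cdefs(["r1", "r2", "r3"]):
    print(line)
```

The last two lines printed have the same shape as the divisor and average expressions above, just with generated names.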

This falls down only if ALL your datapoints are NaN, because the divisor is then zero, in which case you're screwed anyway, right?
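For what it's worth, the whole recipe, including that corner case, can be sketched in Python like this. It is an emulation, not RRDtool itself: the function name is invented, and returning NaN when nothing is valid is a choice made for this sketch. RRDtool's RPN has an UNKN operator that could probably play the same role in a CDEF, but that variant is not part of the recipe above and is untested here.

```python
import math

def nan_aware_average(readings):
    """Average the readings, ignoring NaN; NaN if every reading is NaN."""
    # Step 1: NaN becomes zero so it does not poison the sum.
    total = sum(0.0 if math.isnan(r) else r for r in readings)
    # Step 2: count only the readings that carried a real value.
    divisor = sum(0 if math.isnan(r) else 1 for r in readings)
    if divisor == 0:
        # Nothing valid to average, so report "unknown" instead of dividing by zero.
        return float("nan")
    return total / divisor

nan = float("nan")
print(nan_aware_average([120.0, nan, 80.0]))  # 100.0
print(nan_aware_average([0.0, 0.0]))          # 0.0
print(math.isnan(nan_aware_average([nan, nan])))  # True
```

Note that real zeros are fine: they count as valid values and the average comes out as zero.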

David Thornton