At its core, Splunk is about delivering statistics, so a few statistical principles affect how you structure search commands and interpret results. What follows is not even enough statistics to be dangerous.


Choosing a Time Window

The method you use to calculate usage and concurrency numbers needs to be stated explicitly. The choice is really a matter of statistical principle, but it greatly affects the numbers.

I’ll state up front what I think we should do for usage-compliance items, and then explain why:

  • For the Unique Users calculation, we grab all the data for the month at once and run Unique(by 1 entire month).

  • For the Simultaneous calculation, we grab data in 1-hour periods and run Simultaneous(by 1 hour); the figure for the month is then Max(Simultaneous(by 1 hour)) over all hours of the month.

Our data changes frequently, and a concurrency calculation depends heavily on those changes, whereas a Unique calculation doesn’t care about the rate of change and just needs all the settled data.

In general, for statistics, you grab data from a fixed length of time (a Window) and perform some calculation on it. Therefore, for your time period of interest (Duration):

    Time period of interest (Duration) divided by Window length = No. of Windows 
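To make the arithmetic concrete, here is a minimal Python sketch of the window count. The function name `number_of_windows` and the 30-day month are assumptions for illustration only; a partial trailing window is counted as one extra window.

```python
from datetime import timedelta


def number_of_windows(duration: timedelta, window: timedelta) -> int:
    """Return how many windows of length `window` are needed to cover `duration`.

    A leftover partial window at the end is counted as a full window.
    """
    if window <= timedelta(0):
        raise ValueError("window length must be positive")
    whole, remainder = divmod(duration, window)
    return whole + (1 if remainder else 0)


# A 30-day month split into 1-hour windows.
hourly = number_of_windows(timedelta(days=30), timedelta(hours=1))
print(hourly)  # 720

# The same month treated as one single window.
monthly = number_of_windows(timedelta(days=30), timedelta(days=30))
print(monthly)  # 1
```

So the month-long Unique calculation works on 1 window, while the hourly Simultaneous calculation works on 720 of them for a 30-day month.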

Let’s say you have this:

For Unique Users, you want the figure for the entire period taken as one window:   3

For Simultaneous, you want to catch as much activity as possible, so use the smaller window.

  • In time window 1 max simultaneous: 1

  • In time window 2 max simultaneous: 2

  • Thus, overall your Max simultaneous for the whole period is:  2
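The numbers above can be reproduced with a small Python sketch. Everything in it is an assumption made for illustration: the session records, the user names, the two 1-hour windows, and the reading of "Simultaneous(by 1 hour)" as the number of distinct users with activity overlapping each window. It is not a Splunk search, only a model of the windowing idea.

```python
from datetime import datetime, timedelta

# Hypothetical sessions as (user, start, end). Illustrative data only.
SESSIONS = [
    ("alice", datetime(2024, 1, 1, 9, 5), datetime(2024, 1, 1, 9, 20)),
    ("bob", datetime(2024, 1, 1, 10, 10), datetime(2024, 1, 1, 10, 40)),
    ("carol", datetime(2024, 1, 1, 10, 30), datetime(2024, 1, 1, 10, 50)),
]


def unique_users(sessions):
    """Distinct users across all sessions, regardless of when they were active."""
    return len({user for user, _, _ in sessions})


def active_users_per_window(sessions, period_start, period_end, window):
    """Count distinct users whose sessions overlap each window [t, t + window)."""
    counts = []
    t = period_start
    while t < period_end:
        window_end = min(t + window, period_end)
        active = {
            user
            for user, start, end in sessions
            if start < window_end and end > t  # session overlaps this window
        }
        counts.append(len(active))
        t = window_end
    return counts


period_start = datetime(2024, 1, 1, 9)
period_end = datetime(2024, 1, 1, 11)

hourly = active_users_per_window(SESSIONS, period_start, period_end, timedelta(hours=1))
whole = active_users_per_window(SESSIONS, period_start, period_end, timedelta(hours=2))

print(unique_users(SESSIONS))  # 3
print(hourly)                  # [1, 2]
print(max(hourly))             # 2
print(whole)                   # [3]
```

With one window spanning the whole period the count is 3, the same as Unique Users; with two 1-hour windows the maximum is 2. The window length alone changes the reported number, which is why the method has to be stated.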