Exploring OM Splunk Data
What are the agent states for UCCE, and what are their stats?
index="main" earliest="-2h" source="mediabar" "network.his.model"=RNA-CiscoUCCE | top mb.agent.state by network.his.model | fields - network.his.model
Results:
mb.agent.state | count | percent |
---|---|---|
Handling interaction | 514840 | 73.018975 |
Unavailable | 108028 | 15.321447 |
Available | 43564 | 6.178616 |
Mixed states | 18121 | 2.570074 |
Connected to Harmony | 12569 | 1.782642 |
Wrap up | 7955 | 1.128246 |
How to identify, at a high level, the major components in use by the customer?
index="main" earliest="8/3/2020:06:00:00" latest="8/3/2020:06:30:00" source="mediabar" | eval crmcust='crm.customer' | eval agent='crm.id' | eval class='mb.className' . "-" . 'mb.functionName' | search crmcust="*" agent="*" class="*" | stats values(class) as lc, count(class) as cc by crmcust, agent | where ((crmcust="veritas" AND cc > 1850) OR (crmcust="chewy" AND cc > 400) OR (crmcust="arval" AND cc > 1200))
How to convert Splunk events to look like the regular HIS/CS log statements I am used to?
| tstats count WHERE (index="main" earliest="8/1/2020:06:00:00" latest="8/1/2020:06:30:00" source="mediabar" host="https://chewy.custhelp.com" ) BY _time logLevel crm.instanceId crm.groupId crm.id mb.className mb.functionName message span=1s
Currently, the majority of searches are centered around the component names, 'mb.className' and 'mb.functionName', and string matching. With some familiarity, a quick glance at the component names is enough to tell whether or not an agent is getting screen pops from Harmony.
Components by Version
Logging design is still undergoing changes, so the component names ('mb.className' and 'mb.functionName') can vary by version.
| tstats count WHERE (index="main" earliest="8/17/2020:08:00:00" latest="8/18/2020:08:00:00" source="mediabar") BY mb.version mb.className mb.functionName | eval major=mvindex(split('mb.version', "."), 0) | eval class='mb.className' . "-" . 'mb.functionName' | search class="*" | stats values(class) as lc, count(class) as cc by major
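To make the grouping step easier to follow outside Splunk, here is a minimal Python sketch of what the `eval major=mvindex(split('mb.version', "."), 0)` and `stats values(class) ... by major` steps compute. The version numbers and rows are hypothetical sample data, not values taken from the index.

```python
from collections import defaultdict


def major_version(version):
    # Mirrors mvindex(split('mb.version', "."), 0): text before the first dot.
    return version.split(".")[0]


def components_by_major(rows):
    # Mirrors: eval class=className-functionName | stats values(class) by major
    grouped = defaultdict(set)
    for version, class_name, function_name in rows:
        grouped[major_version(version)].add(f"{class_name}-{function_name}")
    # values() in Splunk returns unique values in sorted order.
    return {major: sorted(classes) for major, classes in grouped.items()}


# Hypothetical rows: (mb.version, mb.className, mb.functionName)
rows = [
    ("4.2.1", "OmisService", "login"),
    ("4.3.0", "OmisService", "screenPop"),
    ("5.0.2", "PopflowRuntimeService", "run"),
]
result = components_by_major(rows)
print(result)
```

Grouping by major version only keeps the result small; grouping by the full 'mb.version' string would show per-release differences at the cost of more rows.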
Simplify Log-Style Statements to Fewer Fields (save screen real-estate)
Show fewer fields in our log-style statements, in this case for one agent. This demonstrates using a sub-search (starts with left bracket '[') for an inexpensive query and passing the result back to the main search pipe.
| tstats count WHERE (index="main" earliest="-24h" host="https://faq.arval.it" [ | tstats count WHERE (index="main" earliest="-24h" host="https://faq.arval.it") BY crm.id | top limit=1 crm.id| rename count as c | rename percent as p | fields - c p | format]) BY _time logLevel crm.id mb.className mb.functionName message span=1s | eval class='mb.className' . "-" . 'mb.functionName' | search class="*" | table _time logLevel crm.id class message
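The sub-search works because the trailing `format` command turns its result rows into a search expression that the outer WHERE clause can consume. The Python sketch below approximates the default output shape of `format` for non-empty results (rows joined with OR, fields within a row joined with AND). The agent ID is a hypothetical placeholder, and the empty-result case is not modeled.

```python
def format_subsearch(rows):
    # Approximates the default output of SPL's `format` command:
    # each row becomes ( field="value" AND ... ), rows are joined with OR.
    clauses = []
    for row in rows:
        pairs = " AND ".join(f'{field}="{value}"' for field, value in row.items())
        clauses.append(f"( {pairs} )")
    return "( " + " OR ".join(clauses) + " )"


# Hypothetical result of the inner search after `top limit=1 crm.id`
# and removal of the count and percent columns.
inner_result = [{"crm.id": "agent-001"}]
expression = format_subsearch(inner_result)
print(expression)
```

With a single row, the outer search is effectively filtered to one `crm.id`, which is why the inner query only needs to be cheap, not detailed.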
Significant 'mb.className' and 'mb.functionName' Values
HIS | PF | QA |
---|---|---|
OmisService | PopflowRuntimeService | QueueAdapterService |
login, startSession, endSession, bind*, screenPopChat, updateAgentStatus, screenPop | run, executePopFlow, nextActivity, activityMapper, workflowQueue.subscribe | connect, disconnect, popChat, acceptEmail, chatComplete, chatConcludeRelease |
How to filter data by production versus lower environments?
This query shows how to distinguish between production and lower environments. It is also the simplest way to count unique agent logins by host (customer URL).
| tstats distinct_count(crm.username) as agents_dc_per_h WHERE (index="main" earliest="-1h@h" [|inputlookup spl-customer-host.csv | where cloudenv="prod" | fields displaycustomer hostlookup | lookup spl-customer-host.csv displaycustomer cloudenv OUTPUT hostlookup | fields - displaycustomer | rename hostlookup as host | format]) BY _time host span=1h
List of customers and their sites/versions?
| tstats count WHERE (index="main" earliest=-5d@d latest=now source="mediabar") BY crm.customer mb.version network.appManagerDomain network.crmHost | sort mb.version DESC | dedup network.crmHost | table crm.customer mb.version network.crmHost | sort crm.customer
The Splunk lookup table (spl-customer-host.csv) that makes the production-versus-lower-environment filter possible:
| inputlookup spl-customer-host.csv | WHERE NOT (displaycustomer in ("omdemo", "omdev", "omqa", "omtrain") ) | dedup displaycustomer | lookup spl-customer-host.csv displaycustomer OUTPUT crmcustomer cloudenv hostlookup
Popflow Events
How to identify Popflow and how it is being used, aka "Popflow Overview"?
| tstats count WHERE (index="main" earliest="8/17/2020:08:00:00" latest="8/18/2020:00:00:00" "mb.className"=PopflowRuntimeService (( message="*Event '*' *ed" ) OR ( message="*Activity complete*" ) OR ( message="*Starting Activity*" ) OR ( message="*Activity event*" ) OR ( message="*Got*popflow*" AND message!="*Got* 0*" ) OR ( message="*Getting*" )) ) BY _time logLevel crm.customer crm.instanceId crm.groupId crm.id mb.className mb.functionName message span=1s | rex field=message "^(?<mytitle>[^{\n]*)(?P<myjson>{.*})" | eval jsonctx=substr(myjson, 1, 40), msgctx=substr(message, 1, 40) | eval class='mb.className' . "-" . 'mb.functionName', crmgroup='crm.instanceId' . "-" . 'crm.groupId'| search crmgroup="*" class="*" | table _time logLevel crm.customer crmgroup crm.id class mytitle jsonctx msgctx
Explanation:
- The '| search crmgroup="*" class="*"' clause and all the string matching are there because, as previously described, we are still dependent on string matching and class names.
- The field 'msgctx' is present for context and would be used if we removed the filter 'mb.className'=PopflowRuntimeService.
- We populate the 'mytitle' and 'jsonctx' fields this way for 2 reasons: first, if they are blank, it may indicate a message we aren't expecting and/or are parsing incorrectly; second, collapsing 2 fields down to 1 simply saves space, so the 'message' field can be viewed without scrolling.
- One of the most important statements in this query is the use of regular expressions (pattern matching):
| rex field=message "^(?<mytitle>[^{\n]*)(?P<myjson>{.*})"
There is a page dedicated to tools for pattern matching and JSON manipulation in Splunk; keep checking back for updates.
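The `rex` pattern can be tried outside Splunk before running it against the index. The sketch below is a Python approximation: Python's `re` module only accepts the `(?P<name>...)` group syntax, so the `(?<mytitle>...)` group is rewritten in that form, and the literal braces are escaped for clarity. The sample messages are hypothetical.

```python
import re

# Same idea as: rex field=message "^(?<mytitle>[^{\n]*)(?P<myjson>{.*})"
# mytitle = everything before the first "{", myjson = the brace-delimited rest.
PATTERN = re.compile(r"^(?P<mytitle>[^{\n]*)(?P<myjson>\{.*\})")


def split_message(message):
    match = PATTERN.search(message)
    if match is None:
        # No JSON payload found; in Splunk both fields would stay blank.
        return None, None
    return match.group("mytitle"), match.group("myjson")


# Hypothetical log message with a JSON payload.
title, payload = split_message('Starting Activity {"type": "screenPop", "id": 7}')
jsonctx = payload[:40]  # mirrors substr(myjson, 1, 40)
print(repr(title), jsonctx)

# A message without braces yields no match at all.
print(split_message("Getting popflow for eventId 12"))
```

Note that a message with no JSON payload leaves both fields empty, which matches the first reason given above for populating 'mytitle' and 'jsonctx'.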
From here, we are going to keep building upon the Popflow Overview, extracting new information until we have a fully populated breakdown of the events.
Workflows are authored to act on events. An "event detected" message can have a corresponding action to fetch a workflow, logged as a "getting popflow for eventId" message. A "got popflow" message follows, which loads the workflow and starts running activities of different types, tracked by "starting activity" and "activity complete" messages.
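As a rough illustration of this lifecycle, the Python sketch below sorts messages into stages using the same wildcard patterns as the Popflow Overview query. The stage labels and sample messages are hypothetical, and the matching here is case-sensitive, which may be stricter than Splunk's own matching.

```python
from fnmatch import fnmatchcase


def classify(message):
    # Patterns are copied from the message filters in the Popflow Overview query.
    if fnmatchcase(message, "*Event '*' *ed"):
        return "event"
    if fnmatchcase(message, "*Getting*"):
        return "getting popflow"
    # The query keeps "Got ... popflow" messages unless they report 0 results.
    if fnmatchcase(message, "*Got*popflow*") and not fnmatchcase(message, "*Got* 0*"):
        return "got popflow"
    if fnmatchcase(message, "*Starting Activity*"):
        return "starting activity"
    if fnmatchcase(message, "*Activity event*"):
        return "activity event"
    if fnmatchcase(message, "*Activity complete*"):
        return "activity complete"
    return None  # the query would not return this message


# Hypothetical messages, one per lifecycle stage plus one excluded case.
samples = [
    "Event 'chatOffered' detected",
    "Getting popflow for eventId 42",
    "Got 1 popflow",
    "Got 0 popflows",
    "Starting Activity screenPop",
    "Activity complete: screenPop",
]
stages = [classify(m) for m in samples]
print(stages)
```

Labeling each message with its stage is one possible first step toward the fully populated breakdown of events described above.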
What Popflow Activities are Being Run?
In overall product usage tracking, I like to track workflows being run and the number of instructions (aka Activities) as an overall indicator of scale and volume. But let’s start with an Activity overview in a log-format style.
Normalize the Data, Put Events, Popflow Scripts, and Activities Together
This document may contain confidential and/or privileged information belonging to OpenMethods. If you are not the intended recipient (or have received this document in error) please notify the sender immediately and destroy this document. Any unauthorized copying, disclosure, or distribution of the material in this document is strictly forbidden.