Overview
This article discusses the Splunk Search app and the Splunk Search Processing Language (SPL).
1. Using the Search App
Open the Search App
From Splunk Home, click Search & Reporting in the Apps panel. This opens the Search Summary view in the Search & Reporting app.
Search Summary view
Before you run any searches, note the common elements of the Search Summary view, shown in Figure 1 below. These elements are present throughout the app.
Figure 1. Search App Summary View |
Element No. | Element | Description | Additional Info |
---|---|---|---|
1 | Applications menu | Switch between Splunk applications that you have installed. The current application, Search & Reporting app, is listed. This menu is on the Splunk bar. | To get back to the search home from any view that has the Splunk bar, click the “App Search & Reporting” menu and choose “Search & Reporting” (the Splunk app) |
2 | Apps bar | Navigate between the different views in the application you are in. For the Search & Reporting app the views are: Search, Datasets, Reports, Alerts, and Dashboards. | |
3 | Search bar | Specify your search criteria. | |
4 | Time range picker | Specify the time period for the search, such as the last 30 minutes (default). | |
2. Before Running a Search
Before running a search, it helps to understand a few key concepts of the Splunk search language.
Indexes
As covered on the https://openmethodsdev.atlassian.net/wiki/spaces/OKB/pages/1161297960 page, the Splunk indexers have indexed the data and made those indexes available for search (by index name).
Fields
As the data was indexed, fields were also extracted and made available for search.
Time-Series Events (streaming)
Splunk processes incoming data as time-series events (or as metrics, which will be a future topic) and makes those events available in a streaming fashion.
The query process for streaming events is very different from the query process for classic SQL or NoSQL data stores. Terms such as ‘WHERE’, ‘BY’, and ‘sort’ appear in both kinds of query language. The logic is quite different, however: data is fetched and processed in streams, and it is transformed and shaped in multiple passes as events flow through a data pipe.
Types of Search Commands
Search commands are either streaming or non-streaming. Non-streaming commands are also called dataset processing commands (e.g., the ‘sort’ command); a non-streaming command needs the entire dataset before it can run.
Streaming commands are commands that operate on events as they are returned by the search.
Note: the search may operate on events based on the time recorded in the event package sent to Splunk or based on the time order of when the event was processed into the index. Watch for this as you explore search commands.
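The two time orderings in the note above can be illustrated with a small Python sketch. It is only an analogy with made-up events. ‘_time’ and ‘_indextime’ are the Splunk field names for the event timestamp and the index timestamp, while the ‘id’ key and all values are hypothetical.

```python
# Hypothetical events: '_time' is the timestamp recorded in the event,
# '_indextime' is when it was processed into the index (both in seconds).
events = [
    {"id": "a", "_time": 100, "_indextime": 130},  # recorded first, indexed last
    {"id": "b", "_time": 110, "_indextime": 115},
    {"id": "c", "_time": 120, "_indextime": 125},
]

# Order by the time recorded in the event.
by_event_time = [e["id"] for e in sorted(events, key=lambda e: e["_time"])]

# Order by the time the event was processed into the index.
by_index_time = [e["id"] for e in sorted(events, key=lambda e: e["_indextime"])]

print(by_event_time)  # ['a', 'b', 'c']
print(by_index_time)  # ['b', 'c', 'a']
```

The same three events come back in a different order depending on which timestamp drives the search, which is why it is worth checking which one a command uses.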
Streaming Types of Commands
Streaming commands are preferred because they use fewer resources and perform better. A non-streaming command has to pause until it has collected the entire dataset, hold that dataset in memory, and then operate on the larger set of data.
Distributable streaming: the order of events does not matter (e.g., the ‘rex’ command, which is Splunk’s regex command).
Centralized streaming: the order of events matters, that is, the command transforms events in order.
Transforming: orders the results into a dataset by transforming each event, typically to produce statistics.
Generating: fetches events from the indexes without any transformation, that is, it fetches data from the indexes and starts the search data stream. It may be event-generating or report-generating.
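The difference between a streaming and a non-streaming command can be sketched outside Splunk with Python generators. This is an analogy only, not how Splunk is implemented. The sample events, the `agent=` text pattern, and the function names `generate_events`, `extract_agent`, and `sort_by_time` are all hypothetical.

```python
import re
from typing import Dict, Iterable, Iterator, List

Event = Dict[str, object]


def generate_events() -> Iterator[Event]:
    """Stand-in for a generating command: starts the event stream."""
    # Hypothetical sample events, not real MediaBar data.
    yield {"_time": 3, "_raw": "agent=alice action=login"}
    yield {"_time": 1, "_raw": "agent=bob action=logout"}
    yield {"_time": 2, "_raw": "agent=carol action=login"}


def extract_agent(events: Iterable[Event]) -> Iterator[Event]:
    """Streaming, in the spirit of 'rex': handles one event at a time."""
    pattern = re.compile(r"agent=(?P<agent>\w+)")
    for event in events:
        match = pattern.search(str(event["_raw"]))
        # Each event is emitted as soon as it has been processed.
        yield {**event, "agent": match.group("agent") if match else None}


def sort_by_time(events: Iterable[Event]) -> List[Event]:
    """Non-streaming, in the spirit of 'sort': needs the whole dataset."""
    collected = list(events)  # the stream must be fully consumed first
    return sorted(collected, key=lambda e: e["_time"])


results = sort_by_time(extract_agent(generate_events()))
agents_in_time_order = [event["agent"] for event in results]
print(agents_in_time_order)  # ['bob', 'carol', 'alice']
```

`extract_agent` can hand its first result downstream before the second event has even been read, whereas `sort_by_time` cannot return anything until the stream is exhausted.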
2.4 Structure of a Search
The structure of a search is illustrated below in Figure 2.
Figure 2. Structure of a search |
The first command must be a ‘generating’ command. The generating commands used in this primer are ‘search’, ‘tstats’, and ‘makeresults’.
The results of one command are passed to the next command with a '|' (pipe) symbol, which streams the output of one command into the next. There can be (0…N) data processing commands.
The last segment is a method to “display” results visually or in a table.
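As a rough illustration of this structure, the following Python sketch assembles a search string from one generating command and zero or more processing commands. The helper name `build_search` is hypothetical and is not part of Splunk. The search terms are the ones used elsewhere in this article.

```python
from typing import Sequence


def build_search(generating: str, processing: Sequence[str] = ()) -> str:
    """Join a generating command and 0..N processing commands with pipes."""
    if not generating.strip():
        raise ValueError("a search must start with a generating command")
    # Each '|' passes the results of one command to the next.
    return " | ".join([generating.strip(), *(p.strip() for p in processing)])


# Zero processing commands: the generating command alone is a valid search.
events_only = build_search('source="*"')

# One processing command: 'top' summarizes the values of the 'source' field.
top_sources = build_search('source="*"', ["top source"])

print(events_only)  # source="*"
print(top_sources)  # source="*" | top source
```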
3. Search Screen Overview
This section discusses the search results screen.
Per the above section on “Structure of a search”, the first command must be "generating". Then, we need some fields to search on.
The standard built-in fields for any Splunk deployment are: ‘host’, ‘source’, ‘sourcetype’, and ‘index’.
‘index’ isn't easy to remember at first,
‘sourcetype’ isn’t currently meaningful because we are sending only one event structure at this time, with sourcetype="_json" (JSON over HTTPS, per the architecture diagram in the first article in this series).
The primary “generating” command is ‘search’, so it is the default and is implied. To run a search:
Ensure the time range picker is set to “Last 30 minutes”.
In the search bar, type search source="*" and press Enter. Because the search command is implied, the search bar runs the search and shortens it to source="*".
Figure 3 shows the results of the search.
Figure 3. |
Element No. | Element | Description |
---|---|---|
1 | Search bar | Specify your search criteria. |
2 | Time range picker | Specify the time period for the search, such as the last 30 minutes (default). |
3 | Search action buttons | Actions that can be performed on the search, primarily used for two purposes: working with a search job, such as sending a long-running job to the background, and printing or exporting results to CSV. |
4 | Search results tabs | The tab that your search results appear on depends on your search. Some searches produce a set of events, which appear on the Events tab. Other searches transform the data in events to produce search results, which appear on the Statistics tab. These tabs become very important later when working with searches. |
5 | Timeline | A visual representation of the number of events that occur at each point in time. |
6 | Fields sidebar | Displays a list of the fields discovered in the events. SELECTED FIELDS lists the fields shown with each event, by default ‘host’, ‘source’, and ‘sourcetype’. INTERESTING FIELDS lists other fields that appear in at least 20% of the events in the result set. |
7 | 1 of 3 events displayed in results | This event has been expanded in the view pane just enough to inspect the ‘message’ field. Events are sent to Splunk as JSON over HTTPS. The ‘message’ field was not extracted during indexing, so it remains as 'inner' JSON that contains data important to OpenMethods result sets. |
8 | 2 of 3 events displayed in results | This event was formatted by clicking the “Show as raw text” link associated with each event. The raw text, shown in another built-in field called ‘_raw’, is exactly how we sent it into Splunk from the MediaBar. |
9 | 3 of 3 events displayed in results | The JSON of this event was expanded just enough to reveal some of the most important fields in our dataset: ‘crm’ data such as ‘crm.customer’; MediaBar (‘mb’) data such as ‘mb.agent’ and agent details; and ‘mb.className’ and ‘mb.functionName’, which indicate which component/function/service this event might be related to (popflow, mediabar, request/response from HIS, QA, CS). |
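Because the ‘message’ field arrives as 'inner' JSON, reading it takes two decoding passes. The Python sketch below shows the idea on a made-up event. The event layout, the placement of ‘mb’ inside ‘message’, and all values are assumptions for illustration, not the actual MediaBar payload.

```python
import json

# Hypothetical event: the layout and every value are made up for illustration.
raw_event = json.dumps({
    "source": "MediaBar",
    "message": json.dumps(
        {"mb": {"agent": "agent-01", "className": "ExampleClass"}}
    ),
})

# First pass: decode the event itself.
outer = json.loads(raw_event)
message_type = type(outer["message"]).__name__  # 'message' is still a string

# Second pass: decode the inner JSON carried in 'message'.
inner = json.loads(outer["message"])
agent = inner["mb"]["agent"]

print(message_type)  # str
print(agent)         # agent-01
```

Until the second pass runs, nothing inside ‘message’ is addressable as a field, which is the situation described for element 7 above.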
3.1 More on “Fields sidebar”
In the Fields sidebar, each field comes with statistics on how many events in the current result set have a non-null value for it and what those values are. This is a useful way to explore the fields, learn their values, and decide what to look for in your next query.
At the bottom of the Fields sidebar in Figure 3, there is a link, “21 additional fields”. Clicking it opens a dialog that shows the statistics for all fields.
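The kind of per-field statistics described here can be approximated in a few lines of Python. This is a sketch over a made-up result set, not Splunk's own calculation. The function name `field_summary`, the ‘level’ field, and all values are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Optional

# Hypothetical result set; None marks a field with no value in that event.
events: List[Dict[str, Optional[str]]] = [
    {"source": "MediaBar", "host": "h1", "level": "info"},
    {"source": "MediaBar", "host": "h2", "level": None},
    {"source": "App Manager", "host": "h1"},
    {"source": "MediaBar", "host": "h1", "level": "error"},
]


def field_summary(rows: List[Dict[str, Optional[str]]], field: str) -> dict:
    """Count events where 'field' is non-null and tally its values."""
    values = [row[field] for row in rows if row.get(field) is not None]
    return {
        "events_with_value": len(values),
        "coverage_percent": 100.0 * len(values) / len(rows),
        "values": Counter(values),
    }


level = field_summary(events, "level")
source = field_summary(events, "source")

print(level["events_with_value"], level["coverage_percent"])  # 2 50.0
print(source["values"].most_common(1))  # [('MediaBar', 3)]
```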
The UI also supports mouse-over/hover. Hover over or click the ‘source’ field in the SELECTED FIELDS section to show the view in Figure 4 below.
Figure 4. Statistics popup for ‘source’ | The statistics sample of the ‘source’ field shows ‘MediaBar’, ‘App Manager’, and ‘ConnectAPI PROD’. ‘MediaBar’ is the entire focus of these wiki articles; the other sources are at an early stage. All events are sent to index="main" by default, so the baseline “generating” command should always include index="main" source="MediaBar" (matching is case-insensitive by default). |
Use Search for Field statistics
We can use the Splunk search language to generate the statistics for ‘source’ ourselves.
source="*" | top source
source | count | percent |
---|---|---|
MediaBar | 229564 | 99.674790 |
App Manager | 749 | 0.325210 |
The search matched 230,313 events (8/10/20 8:35:00.000 PM to 8/10/20 9:05:00.000 PM) and returned 2 rows on the Statistics tab.
Why is it different from the statistics for Figure 3 and Figure 4 above? We will answer that in a separate article.
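The ‘count’ and ‘percent’ columns reported above can be reproduced from the raw counts. The Python sketch below does so. It approximates what the ‘top’ command reports for this result and is not Splunk's implementation. The function name `top` is reused here only for readability.

```python
from collections import Counter
from typing import List, Tuple


def top(values: List[str]) -> List[Tuple[str, int, float]]:
    """Return (value, count, percent) rows, most frequent first."""
    counts = Counter(values)
    total = sum(counts.values())
    return [
        (value, count, round(100.0 * count / total, 6))
        for value, count in counts.most_common()
    ]


# Rebuild the 'source' column from the counts reported above.
sources = ["MediaBar"] * 229564 + ["App Manager"] * 749

rows = top(sources)
for value, count, percent in rows:
    print(f"{value}\t{count}\t{percent:.6f}")
# MediaBar      229564  99.674790
# App Manager   749     0.325210
```

Each percent is simply the count divided by the 230,313 matched events, so the two rows add up to 100%.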