The Splunkist

Splunk | collect, part 2

Aaron White — Wed, 13 Oct 2021 11:59:44 GMT

Every precaution should be taken to onboard data correctly; nevertheless, there are times when it is necessary to move data from one index to another. When a Splunk admin has access to the file system, entire Splunk indexes can be moved, migrated, or copied with relative ease. However, if you are a Splunk Cloud customer, or you only need to copy a subset of data from one index to another, the collect command might be your best option.

In this hypothetical scenario, the Splunk admin misspelled the dragos index as dargos.

Let's use the collect command to help this Splunk admin out.

index=dargos sourcetype=dragos_alert source=dragos_alerts
| collect index=dragos sourcetype=dragos_alert source=dragos_alerts
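If you would rather preview the result before writing anything, the collect command accepts a testmode option. The following is a minimal sketch, not part of the original walkthrough; the 15-minute time range is an assumption chosen only to keep the preview small.

```spl
index=dargos sourcetype=dragos_alert source=dragos_alerts earliest=-15m
| collect index=dragos sourcetype=dragos_alert source=dragos_alerts testmode=true
```

With testmode=true nothing is written to the dragos index; the search results are only shown as they would appear if they had been collected. Remove the option (or set it to false) to perform the actual copy.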

Now the data has been copied to the dragos index and can be searched there. Note that collect copies events; the originals remain in the dargos index.
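One way to sanity-check the copy is to compare event counts in the two indexes. This is a rough sketch using tstats; run it over the same time range you used for the collect search, and allow a short delay for the collected events to be indexed.

```spl
| tstats count where index=dargos OR index=dragos by index, sourcetype, source
```

If the counts for the two indexes match, the copy most likely completed as intended.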

What about my old data?

You have several options:

- Delete the old index.
- Adjust the Searchable time (days) setting to 1 day and wait for your data to age out. You can find this setting under Settings > Indexes; select Edit next to your chosen index.
- Use the delete command.
  - A note about the delete command: a user must have the delete_by_keyword capability, and by default no user has it, not even one with the admin role. Splunk ships with a dedicated can_delete role that carries this capability, but it is not assigned to anyone out of the box. Also, the delete command does not reclaim disk space; rather, it makes the data unsearchable. To learn more about the delete command, check out the Splunk docs.
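To see which roles on your deployment carry the capability, something like the following may help. It is a sketch that assumes your account is allowed to use the rest command against the roles endpoint, which may not be the case in every Splunk Cloud environment.

```spl
| rest /services/authorization/roles splunk_server=local
| search capabilities="delete_by_keyword" OR imported_capabilities="delete_by_keyword"
| table title
```

Each row returned is a role that has delete_by_keyword either directly or through inheritance.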

Move a subset of data

What if I don't want to move the entire index, but just a subset of its data? The collect command can be used here as well. In this scenario we will take the WinEventLog:Security events and copy them from the windows index into their own index called oswinsec.

NOTE: Before copying data into the new index, update your WinEventLog:Security input in Splunk_TA_windows/local/inputs.conf to point at the new index so that all new data is sent there.

# Splunk_TA_windows/local/inputs.conf
[WinEventLog://Security]
index = oswinsec
disabled = 0

Once you've identified the events that you would like to move and have selected the appropriate time range, you are ready to pipe your search to collect:

index=windows sourcetype=WinEventLog source="WinEventLog:Security"
| collect index=oswinsec sourcetype=WinEventLog source="WinEventLog:Security"

Now you can search your data in the new index.
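Before removing anything from the old index, it may be worth confirming that both indexes hold the same events day by day. A minimal sketch, assuming you run it over the same time range you copied:

```spl
(index=windows OR index=oswinsec) sourcetype=WinEventLog source="WinEventLog:Security"
| timechart span=1d count by index
```

Days where the two columns differ point to a time range that was missed or copied twice.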

Segregating your operational logs from your security logs can be beneficial if you have different access or storage retention requirements. For example, you could limit access to the oswinsec index to just your security team and configure a 1-year retention on it, while configuring a 60-day retention on your operational indexes and allowing everyone to search them.

What about my old data?

Assuming you don't want to wait for the old data to age out, the only viable option in this scenario is to use the delete command.
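As a sketch, the same search that selected the events for collect can be piped to delete instead. The time bounds below are assumptions for illustration; use the exact range you copied, and run the search as a user with the delete_by_keyword capability.

```spl
index=windows sourcetype=WinEventLog source="WinEventLog:Security" earliest=-30d@d latest=-1d@d
| delete
```

It is prudent to run the search without the delete line first and confirm that it returns only the events you intend to remove, since the deletion cannot be undone.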

A note about license usage

When using the collect command, if no sourcetype is specified, it defaults to “stash”. The benefit of the “stash” sourcetype is that the collected data does not count against your license usage. However, that is not feasible in our scenarios because a sourcetype must be specified, so additional license usage will be incurred under ingest-based licenses. Depending on your licensing model this may or may not be a concern, but it is best to be aware of the implications. Check out Splunk Cloud's Data Policies or refer to your terms and conditions.
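If you want a rough idea of how much license the copied data consumed, the license usage log can be queried. This sketch assumes you have access to the _internal index and that license_usage.log is available in your environment, which may vary by deployment.

```spl
index=_internal source=*license_usage.log* type=Usage idx=oswinsec
| stats sum(b) AS bytes by idx, st
| eval GB=round(bytes/1024/1024/1024, 3)
```

Here idx is the index name, st is the sourcetype, and b is the number of bytes counted against the license.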

Additional Resources:

Splunk Docs on collect

Splunk | collect, part 1

Aaron White — Mon, 20 Sep 2021 21:46:01 GMT

Perhaps you've recently moved your Splunk stack to the cloud and are wondering how you can enrich your Splunk queries with user information from Active Directory. If this is you, follow along as I show you how to configure the Splunk Supporting Add-on for Active Directory and index your LDAP queries in Splunk Cloud.

Prerequisites

A Splunk instance hosted in the cloud
A Splunk Heavy Forwarder on-prem that has access to your Domain Controller
Firewall ports opened between your Heavy Forwarder and Domain Controller: 389 (LDAP) or 636 (LDAP over SSL)
An index – if you are not using an existing index then you will need to create an index for this purpose
Splunk Supporting Add-on for Active Directory

Assumptions:

It is assumed that you already have a Splunk Heavy Forwarder configured and forwarding data to Splunk Cloud.

Splunk Supporting Add-on for Active Directory

Install Splunk Supporting Add-on for Active Directory

If you haven’t already, download the Splunk Supporting Add-on for Active Directory from Splunkbase.
From the Splunk Web home screen of your Splunk Heavy Forwarder, click the gear icon next to Apps.
Click the Install app from file button.
On the upload screen locate the downloaded file and click Upload.
Restart Splunk when prompted.

Configure Splunk Supporting Add-on for Active Directory

Navigate to Apps and select Splunk Supporting Add-on for Active Directory.
Select the Configuration menu.
Configure the default domain:
1. In the Alternate domain name field, type in an alternate representation of the domain in NetBIOS format. Make sure that the alternate domain name is specified in UPPERCASE format.
  
  Example: SPLUNKU
2. In the Base DN field, type, in LDAP notation, the starting point to use when searching for users.
  
  Example: OU=Domain Users,DC=splunku,DC=com
3. In the LDAP Server: Hostname field, type in the name or IP address of the host that the add-on should connect to for this domain.
4. In the LDAP Server: Port field, type in the port that the add-on should connect to on the LDAP server.
5. If you want the add-on to connect to the server over SSL, select the SSL checkbox.
6. In the Credentials: Bind DN field, enter the username that the add-on should use to connect to the LDAP server you specified previously, in LDAP notation.
  
  Example: CN=Splunk LDAP,CN=Users,DC=splunku,DC=com
7. In the Credentials: Password field, enter the password for that user.
8. Click the Test connection button to verify your settings.
9. Click Save to save your changes.

When completed your settings should look similar to this:

Craft your LDAP query

Navigate to search and begin crafting your desired LDAP query:

| ldapsearch search="(&(samAccountType=805306368))" attrs="accountExpires, co, department, displayName, distinguishedName, givenName, l, mail, mobile, manager, memberOf, personalTitle, sAMAccountName, sn, st, telephoneNumber, userAccountControl, whenCreated"
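While you are still experimenting with the filter and attribute list, it can help to pull back only a handful of entries. The sketch below assumes the limit option of ldapsearch is available in your version of the add-on; the shortened attribute list is only for illustration.

```spl
| ldapsearch search="(&(samAccountType=805306368))" attrs="sAMAccountName, displayName, mail" limit=5
```

The filter value 805306368 (0x30000000) is the samAccountType for normal user accounts, so computer accounts are excluded from the results.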

Collect

Now pipe your results to collect and provide the desired index:

| collect index=ldap

Save and schedule your search:

Search your data

From your Splunk Cloud search head, you can now search the ldap index and should see the same events that were generated by the saved search on the Splunk Heavy Forwarder.

You'll notice the indexed events are in JSON format and the fields are not extracted. Use the spath command with no arguments; this puts spath in "auto-extract" mode, which finds and extracts all the fields:

index=ldap
| spath
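If you only need a few fields, spath can also be pointed at specific paths rather than extracting everything. A minimal sketch, assuming the events contain the sAMAccountName and mail attributes requested in the LDAP query; the output field names are illustrative choices.

```spl
index=ldap
| spath path=sAMAccountName output=identity
| spath path=mail output=email
| table identity, email
```

Extracting only what you need keeps the result set narrower than auto-extract mode does.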

A note about spath, sourcetypes, and license usage

When using the collect command, if no sourcetype is specified, it defaults to “stash”. The benefit of the “stash” sourcetype is that the collected data does not count against your license usage; specifying any other sourcetype incurs license usage. The trade-off is that with the “stash” sourcetype the fields in our JSON data are not automatically extracted. Using spath is a quick and easy way to solve this at search time; nevertheless, there are other solutions you may wish to consider. One such option is specifying the sourcetype “json_no_timestamp” when running collect:

| collect index=ldap sourcetype=json_no_timestamp

This properly extracts the fields without any additional work when searching the data; however, the data will count against your license usage. The amount and frequency of data collected will factor into what method to use.

Formatting and outputting

Finally, you can format the data and output it to a lookup table for use in future searches:

index=ldap 
| spath 
| rex field=memberOf "CN=(?<mof_parsed>[^,]+)" 
| rex max_match=5 field=distinguishedName "OU=(?<dn_parsed>[^,]+)" 
| eval memberOf=lower(replace(mvjoin(mof_parsed, "|"), " ", "_")), category=lower(replace(mvjoin(dn_parsed, "|"), " ", "_")), priority=case(match(category, "domain_admin|disabled|hold|executive") OR match(memberOf, "domain_admin|enterprise_admin|schema_admin|administrator"), "critical", match(category, "contractor|service_account|external"), "high", match(category, "employee|training|user_account|users|administration"), "medium", 1==1, "unknown"), startDate=strftime(strptime(whenCreated,"%Y%m%d%H%M"), "%m/%d/%Y %H:%M"), endDate=strftime(strptime(accountExpires,"%Y-%m-%dT%H:%M:%S%Z"), "%m/%d/%Y %H:%M"), watchlist=if(category IN ("disabled", "hold"), "true", "false"), work_city=mvjoin(mvappend(l, st), ", ") 
| rename sAMAccountName as identity, personalTitle as prefix, displayName as nick, givenName as first, sn as last, mail as email, telephoneNumber as phone, mobile as phone2, manager as managedBy, department as bunit, co AS work_country 
| table identity, prefix, nick, first, last, suffix, email, phone, phone2, managedBy, priority, bunit, category, watchlist, startDate, endDate, work_city, work_country, work_lat, work_long 
| outputlookup identities.csv
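Once the lookup exists, it can be used to enrich other searches. The following is a sketch only: it assumes your events carry a user field whose values match the identity column, and the choice of the oswinsec index is purely illustrative.

```spl
index=oswinsec sourcetype=WinEventLog source="WinEventLog:Security"
| lookup identities.csv identity AS user OUTPUT priority, category, bunit
| stats count by user, priority, category, bunit
```

Events whose user has no match in identities.csv will simply have no values for the output fields.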

Additional Resources:

Splunk Docs on collect

Download Splunk Supporting Add-on for Active Directory

Splunk Supporting Add-on for Active Directory Documentation

Deploy a heavy forwarder

LDAP Syntax Filters

Active Directory Attributes