Sampling
Sampling is used to reduce the amount of tracked data being transported to different destinations.
This can help save ingestion and bandwidth costs.
Filtering vs sampling
Filtering is the concept of allowing only certain events to be tracked and transported to different destinations if they match certain conditions.
Sampling, on the other hand, transports only a percentage of the tracked events. Not every user or device transports its tracked events; only a subset does.
Consistency
It is recommended to sample events by a consistent value (such as the userId attribute), because this keeps the sampling strategy consistent for the same user across different devices and sessions (think web, iOS, and Android apps).
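To illustrate why a stable value matters, here is a minimal sketch of deterministic bucketing. The `bucketOf` function and its hash are hypothetical and are not Eventvisor's actual algorithm, which this page does not specify. The point is only that the same input always yields the same bucket, whichever device computes it.

```typescript
// Hypothetical helper, not part of the Eventvisor SDK: maps a string to a
// stable bucket in the range 0..99 using a simple polynomial rolling hash.
function bucketOf(value: string): number {
  let hash = 0;
  for (let i = 0; i < value.length; i++) {
    // Keep the hash within unsigned 32-bit range on every step.
    hash = (hash * 31 + value.charCodeAt(i)) >>> 0;
  }
  return hash % 100;
}

// The same userId gives the same bucket on web, iOS, or Android,
// so the user is either sampled everywhere or nowhere.
const onWeb = bucketOf("user-123");
const onMobile = bucketOf("user-123");
console.log(onWeb === onMobile); // true
```

A random value generated per session would not have this property: the same user could be sampled on one device and dropped on another.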
Defining sampling
You can define sampling rules at either the event level or the destination level.
To keep things simple, we will start with the destination level:
Simple sampling

    # ...
    sample:
      by: attributes.userId
      percentage: 10 # 10% of unique users

This assumes you have already set a userId attribute in your application using Eventvisor SDKs.
Sample by custom range
Instead of a percentage value, you can also be more specific by defining a custom range:

    # ...
    sample:
      by: attributes.userId
      range: [0, 50] # first 50% of users

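One way to picture the difference between `percentage` and `range` is to assume each sampled value is first mapped to a bucket from 0 to 99. That mapping, the `SampleRule` type, the `isSampled` function, and the choice of an exclusive upper bound are all assumptions for illustration; this page does not describe Eventvisor's internals.

```typescript
// Hypothetical rule shape mirroring the `percentage` and `range` keys.
type SampleRule = { percentage: number } | { range: [number, number] };

// Decide whether a bucket (0..99) falls inside the rule's window.
// Assumption: the range includes its start and excludes its end.
function isSampled(bucket: number, rule: SampleRule): boolean {
  if ("range" in rule) {
    const [start, end] = rule.range;
    return bucket >= start && bucket < end;
  }
  // A percentage is treated as the window that starts at 0.
  return bucket < rule.percentage;
}

console.log(isSampled(7, { percentage: 10 })); // true
console.log(isSampled(42, { percentage: 10 })); // false
console.log(isSampled(42, { range: [0, 50] })); // true
console.log(isSampled(60, { range: [50, 100] })); // true
```

Under this reading, a range lets you choose which slice of users is sampled, not only how large the slice is.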
Multiple properties
Some applications are complex, and you may want to sample based on multiple properties together:

    # ...
    sample:
      by:
        - attributes.organizationId
        - attributes.userId
      percentage: 10

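A minimal sketch of how several properties might be combined into one sampling key. The `compositeKey` helper, the `:` separator, and the handling of missing values are assumptions for illustration, not Eventvisor's documented behaviour.

```typescript
type EventAttributes = Record<string, string | undefined>;

// Hypothetical helper: joins the listed attribute values into one key.
// Returns undefined if any listed attribute is missing.
function compositeKey(attributes: EventAttributes, names: string[]): string | undefined {
  const parts: string[] = [];
  for (const attributeName of names) {
    const value = attributes[attributeName];
    if (value === undefined) return undefined;
    parts.push(value);
  }
  return parts.join(":");
}

const samplingKey = compositeKey(
  { organizationId: "org-1", userId: "user-123" },
  ["organizationId", "userId"]
);
console.log(samplingKey); // "org-1:user-123"
```

With a combined key like this, the same user belonging to two organizations would be sampled independently in each.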
First available property
You may also choose to sample based on the user's ID if available, otherwise falling back gracefully to the device ID.
In that case, you can use the first available value:

    # ...
    sample:
      by:
        or:
          - attributes.userId
          - attributes.deviceId
      percentage: 10

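The fallback can be sketched as picking the first candidate that has a value. The `firstAvailable` helper is hypothetical, and treating an empty string as "not available" is an assumption made here for illustration.

```typescript
type EventAttributes = Record<string, string | undefined>;

// Hypothetical helper: returns the first attribute value that is present.
function firstAvailable(attributes: EventAttributes, names: string[]): string | undefined {
  for (const attributeName of names) {
    const value = attributes[attributeName];
    if (value !== undefined && value !== "") return value;
  }
  return undefined;
}

// Logged-in user: sampled by userId.
console.log(firstAvailable({ userId: "user-123", deviceId: "dev-9" }, ["userId", "deviceId"])); // "user-123"
// Anonymous visitor: falls back to deviceId.
console.log(firstAvailable({ deviceId: "dev-9" }, ["userId", "deviceId"])); // "dev-9"
```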
Conditional sampling
You can also define multiple sampling rules based on different conditions. The first rule to match will be applied:

    # ...
    sample:
      - conditions:
          - attribute: platform
            operator: equals
            value: web
        by: attributes.deviceId
        percentage: 10
      # ...more rules here

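The "first rule to match" behaviour can be sketched as a simple ordered scan. The types and the `firstMatchingRule` function below are hypothetical, only the `equals` operator is modelled, and treating an empty condition list as a catch-all is an assumption.

```typescript
type Condition = { attribute: string; operator: "equals"; value: string };
type ConditionalRule = { conditions: Condition[]; by: string; percentage: number };

// Hypothetical matcher: a rule matches when all of its conditions hold.
function firstMatchingRule(
  rules: ConditionalRule[],
  attributes: Record<string, string | undefined>
): ConditionalRule | undefined {
  for (const rule of rules) {
    let matches = true;
    for (const condition of rule.conditions) {
      if (attributes[condition.attribute] !== condition.value) matches = false;
    }
    if (matches) return rule; // first match wins; later rules are ignored
  }
  return undefined;
}

const rules: ConditionalRule[] = [
  {
    conditions: [{ attribute: "platform", operator: "equals", value: "web" }],
    by: "attributes.deviceId",
    percentage: 10,
  },
  // Assumed catch-all for every other platform.
  { conditions: [], by: "attributes.userId", percentage: 50 },
];

const matched = firstMatchingRule(rules, { platform: "web" });
console.log(matched ? matched.percentage : undefined); // 10
```

Because the scan stops at the first match, rule order matters: more specific rules belong before broader ones.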
Different sources
We used attributes for the sampling strategies above, but you are not limited to them alone. You can use other sources as well:

    # ...
    sample:
      by:
        # one of the properties below (not ALL together!)
        attribute: userId

        # or a lookup
        lookup: localstorage.userId
      percentage: 10

Event level sampling
Instead of destinations alone, you can also define sampling rules on a per-event basis:

    # ...
    sample:
      by: attributes.userId
      percentage: 10

Overriding destinations
To define sampling rules for a certain event that target a specific destination only:

    # ...
    destinations:
      browser:
        sample:
          by: attributes.userId
          percentage: 10

If there's any event-level sampling defined, it will take priority over the destination-level sampling rules.
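The priority rule can be pictured as a small resolver. The `SampleConfig` type and `effectiveSample` function are hypothetical names used only to illustrate the precedence described above.

```typescript
type SampleConfig = { by: string; percentage: number };

// Hypothetical resolver: prefer the event's own sampling config when present,
// otherwise fall back to the destination's config.
function effectiveSample(
  eventLevel: SampleConfig | undefined,
  destinationLevel: SampleConfig | undefined
): SampleConfig | undefined {
  return eventLevel !== undefined ? eventLevel : destinationLevel;
}

const destinationLevel: SampleConfig = { by: "attributes.userId", percentage: 10 };
const eventLevel: SampleConfig = { by: "attributes.userId", percentage: 50 };

const withOverride = effectiveSample(eventLevel, destinationLevel);
const withoutOverride = effectiveSample(undefined, destinationLevel);
console.log(withOverride === eventLevel); // true
console.log(withoutOverride === destinationLevel); // true
```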
Sampling at the destination level is highly recommended; reserve event-level sampling for more complex use cases.