Strategies and tips for better error grouping in Raygun
Posted Jan 18, 2022 | 8 min. (1549 words)
Raygun’s automatic error grouping helps you eliminate noise from repeated errors to keep your team focused. Error grouping is most powerful when it’s configured correctly, and most teams see significant reductions in error counts by making just a few changes. You may need to review your error grouping if there are many similar-sounding error groups in your dashboard, for example, five error groups that all read “Script error.” In this article, I’ll go through four key strategies and tips to help you get the most from Raygun’s error grouping logic.
4 ways to improve error grouping
It’s tricky to find a perfect solution to every error grouping scenario, because everyone has slightly different ideas of what should be grouped. Not to worry, though. There are usually four major items to check to improve your error grouping:
1. Update to the latest grouping strategy
2. Check changes in the audit log and improve communication
3. Check if you need errors grouped on factors other than the stack trace
4. Reduce error noise with a clear workflow
But before we get into the specifics, one question that always comes up is:
How does Raygun automatically group errors?
Raygun’s error grouping algorithm relies primarily on the stack trace – the origin of the error. In a nutshell, Raygun will group errors originating from the same line of code, and by error type.
Therefore, Raygun Crash Reporting works best when errors are designed with care (especially when it comes to error types) before they are sent into Raygun.
Design considerations are important. We often find that developers, even the experienced ones, don’t put much thought into the error objects that they produce. But putting in a bit of work at the beginning can make a huge difference when errors come into Raygun, so it’s worth it.
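To make the grouping rule described above concrete, here is a minimal Python sketch. It is not Raygun’s actual algorithm; it only assumes that a group is identified by the error type plus the innermost stack frame (file, function, and line). The names `PaymentDeclinedError`, `InventoryUnavailableError`, `checkout`, `capture`, and `grouping_key` are all hypothetical.

```python
import traceback


class PaymentDeclinedError(Exception):
    """Raised when the payment provider rejects a charge (hypothetical)."""


class InventoryUnavailableError(Exception):
    """Raised when an item is out of stock (hypothetical)."""


def grouping_key(exc: BaseException) -> tuple:
    """Toy grouping key: error type plus the frame where the error was raised.

    Illustrative assumption only, not Raygun's real hasher.
    """
    frame = traceback.extract_tb(exc.__traceback__)[-1]  # innermost frame
    return (type(exc).__name__, frame.filename, frame.name, frame.lineno)


def checkout(kind: str) -> None:
    # Both errors originate in this function, but each has its own type
    # and its own line, so they end up with different keys.
    if kind == "payment":
        raise PaymentDeclinedError("card declined")
    raise InventoryUnavailableError("item out of stock")


def capture(kind: str) -> tuple:
    """Trigger an error and return the toy grouping key for it."""
    try:
        checkout(kind)
    except Exception as exc:
        return grouping_key(exc)
    raise AssertionError("checkout should always raise")
```

Under these assumptions, two payment failures share one key, while an inventory failure gets its own. Had `checkout` raised a bare `Exception` from a single line with different messages, every failure would have collapsed into one group, which is why specific error types are worth the effort.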
Here’s how to write great software error messages.
Now onto the good stuff – how to improve your error grouping quickly.
1. Update to the latest grouping strategy
By far the easiest way to achieve better error grouping is to update to the latest grouping strategy.
An error grouping strategy is the rule set that Raygun is using to group your errors, which we continuously improve and evolve. Running your application on an outdated grouping strategy version means you won’t get all the benefits of the best grouping logic Raygun has to offer. Each time Raygun adds an updated grouping strategy version, we include many new edge cases from the billions of data points we collect. So it’s best to update your grouping strategy regularly to achieve better error grouping.
I spoke with a customer who had first set up their Raygun app years earlier and was unhappy with their current error grouping. It turned out that they were still using v1 of the JavaScript grouping strategy when the latest JavaScript grouping strategy (at the time of writing) is already at v7! Switching to v7 meant that they were able to reduce their error groups by a whopping 83%. A reduction like this dramatically improves the signal-to-noise ratio, making it a lot easier to improve your software.
Where can I find the grouping strategy version information?
- Go to your application
- On the sidebar, click on “Settings” under “Crash Reporting”
- Scroll down to the bottom of the page until you see “Classification strategies”
“Hasher version” is the version of the error grouping strategy that you are currently using.
If you see “(Newer version available)” next to the hasher version value, it means that you are using an out-of-date hasher/grouping strategy.
To change/upgrade your error grouping, click on “Change” and follow the prompt on the next page (but make sure you read the recommended steps below before you make any changes).
What are the recommended steps for upgrading/downgrading?
- Talk it over with your team and make them aware (more on this topic later)
- Upgrades will create entirely new groups for your errors, and the current error groups will become invalid. They will still exist, but no new data will flow into the old error groups
- If you’d still like to proceed, just click “Change” and follow the on-screen instructions
- You will then want to resolve all your existing active errors, so you don’t get a mixture of old and newly grouped active errors. To do this, you will need to go back to the Crash Reporting settings page and use the “Bulk status change” area to resolve or ignore all active errors
All occurring errors will now be grouped based on the new grouping strategy version.
Why won’t Raygun automatically upgrade everyone to the latest version?
Good question! It all comes down to letting our customers decide what’s right for them and their workflow.
There are many ways to use Raygun and we never make any changes to our customers’ data without their express consent.
How to keep up to date with grouping strategy changes?
The best way is to keep an eye on the What’s New section in the sidebar of the Raygun app. This is where you’ll find the latest on any feature releases and tips and tricks to help you make the most of your Raygun subscription.
What are the downsides of upgrading to the latest version?
- Upgrades will create entirely new groups for your errors. You will lose all historical comments, status history, as well as integration links
- Upgrades will override the advanced custom grouping feature (if you are using it)
- If you make the change without consulting your team, you might be forced to revert it for good reason!
2. Check changes in the audit log and improve communication
Team members can change the grouping strategy, as well as merge similar errors inside the app, on an ad-hoc basis by using the merging feature. Problems can arise if team members introduce these changes without talking it over with the rest of the team.
There might be very good reasons why your company is using an out-of-date grouping strategy, which you might not be aware of. So, if there’s a sudden change in error grouping without adequate internal communication, you can imagine the confusion and chaos this could cause, especially if your error event count is high.
If you are confused about sudden error grouping changes within your plan, the first place to look is the audit log. The audit log provides a way to search and observe past actions by Teams you are part of. The audit log makes it easy to pinpoint who might be responsible for the changes you’re seeing.
One customer I’ve spoken with was unhappy with a particular error group that contained errors that shouldn’t have been grouped. It turned out that his team members had gone ahead and manually merged 463 separate error groups based on a common tag.
This situation was unfortunate because, after separate errors are merged into one group, all of them appear under the newly merged group. Recurring errors also land in the merged group, making this a super error group without a common stack trace – yikes!
3. Check if you need errors grouped on factors other than the stack trace
Raygun’s error grouping algorithm relies primarily on the stack trace – the origin of the error. But some of you might prefer to group errors in other ways to suit your business and workflow.
That’s when the custom grouping feature comes to the rescue.
When there are edge cases that Raygun can’t deal with, you can use the custom grouping feature to apply additional logic on top of what Raygun automatically provides.
Custom grouping is an advanced feature that should be used with caution as there is no undo button. It should only be used when the built-in grouping logic doesn’t quite suit you.
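As a rough illustration of the kind of rule custom grouping can express, here is a minimal Python sketch. It does not use any Raygun API; the function `custom_grouping_key`, the `host=` message format, and the `timeout:` key prefix are all hypothetical. The assumed rule is that timeouts should be grouped per upstream host, no matter where in the code they were raised, while everything else keeps the default stack-trace-based grouping.

```python
import re
from typing import Optional


def custom_grouping_key(error_type: str, message: str) -> Optional[str]:
    """Return a custom group name, or None to keep the default grouping.

    Hypothetical rule: timeouts are grouped per upstream host,
    regardless of which line of code raised them.
    """
    if error_type != "TimeoutError":
        return None  # fall back to the built-in grouping
    match = re.search(r"host=([\w.-]+)", message)
    if match is None:
        return None  # no host in the message, so nothing to group on
    return f"timeout:{match.group(1).lower()}"
```

Returning `None` for anything the rule does not cover keeps the custom logic narrow, in line with the advice to use custom grouping only where the built-in logic falls short.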
4. Reduce error noise with a clear workflow
If you have a large application with many moving parts, when you first integrate Raygun, you might find that your Crash Reporting dashboard already has thousands of error groups within the first month.
This can be daunting for new customers. The good news is that this is not an error group issue, but rather a workflow issue that you can fix easily.
You can follow this step-by-step workflow guide that we’ve created with one of our customers, OpenWater. They’ve been where you are, and this guide details how they now use Raygun on a daily basis to reduce their error noise by 99%.
Need further assistance with better error grouping?
We’re continually working on improving our grouping algorithm to ensure that the correct errors are grouped. It is a difficult problem, so it helps to get some examples from our users to help us improve.
If you’ve tried all of the above methods, understood how Raygun automatically groups errors, and you still need further assistance, we’d like to hear from you. You can get in touch with us at any time by using the “Contact Raygun” link in the sidebar of your Raygun app.
Send us the name of the provider and language you are using and the URLs of the error events that you believe are grouped or separated incorrectly, and one of our engineers will be in touch.