
Intro
For one of our customers our goal lately was to make the logs more actionable. We ran into limitations in sentry-sdk
this is how we tackled those problems.
Background
Logging is an integral part of monitoring for an application. It can be a crucial tool for determining the root cause of a problem. With logging however it is not always easy to know upfront what to log and what not. When you’re writing your code it’s actually useful to add more logging to help you better understand the problem you’re solving. That’s why it’s often best to revisit your logging output and strip out what’s unnecessary. Split the noise from the signal. Better logging leads to better root cause analysis.
From an operations point of view, having verbose logging can obscure problems. The ideal would be to have no errors when everything is “ok”. And when errors occur, your logs would contain clear information pointing to what went wrong.
What we experienced however was that there were two main types of logging that emerged from our application. Application logs and domain logs. With domain logs I mean the log entries that would describe an action that is specific to your business domain. A “payment failed” not a “connection failed”. Domain errors can be errors that are perfectly normal to have. For example the customer didn’t have the required funds to finish his payment. Removing this from your logging is not desirable. You still want to have those log lines so that you can see whether or not there’s an increase in failed transactions after releasing a newer version of you application.
Splitting into two logs
We’re using sentry for this specific application and what we wanted was to make as much of the log lines as actionable as possible. We wanted that when a log line appears, it would be something that requires a developer to look into it.
What we did, was make two different logging endpoints (DSN’s) in sentry. One domain-log endpoint and one application-log endpoint. We ran into a problem with sentry-sdk. It appeared that using multiple DSN’s the logs would appear in only one of them. It’s an open issue on their github project.
We don’t have a good story right now for using multiple clients transparently, although explicitly rebinding clients is something that we do aim to support:
It would have been possible to add extra context information to log events and then filter in the sentry UI. However we opted for splitting into two logging endpoints because this would enable us to forward those events to other tools and take appropriate actions.
Show me the code
We used a solution similar to those suggested in the open issue. We have made our own custom EventHandler
:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
import sys
from sentry_sdk import Client
from sentry_sdk.hub import Hub
from sentry_sdk.integrations.django import DjangoIntegration
from sentry_sdk.integrations.logging import EventHandler
class CustomHandler(EventHandler):
def __init__(self, **kwargs):
if 'release' not in kwargs:
kwargs['release'] = fetch_git_sha(os.getcwd())
if 'integrations' not in kwargs:
kwargs['integrations'] = [DjangoIntegration()]
kwargs['before_send'] = self.before_send
domain_dsn = kwargs.pop('domain_dsn', None)
app_dsn = kwargs.pop('app_dsn', None)
if domain_dsn:
self.domain_client = Client(dsn=domain_dsn, **kwargs)
if app_dsn:
self.app_client = Client(dsn=app_dsn, **kwargs)
else:
print('dsn is required in sentry handler settings', file=sys.stderr)
self.dummy_client = Client(transport=self.send_event)
Hub.current.bind_client(self.dummy_client)
super(CustomHandler, self).__init__(
level=kwargs.get('level', logging.NOTSET)
)
def send_event(self, event):
if event.get('logger') == 'domainlog':
self.domain_client.capture_event(event)
else:
self.app_client.capture_event(event)
As you can see the send_event
handler in our code checks wheter the event was for the domainlog
logger or not. The “default” root logger is used for the “application log”.
So our design looks like this:
1
2
3
4
5
6
7
8
9
10
11
12
13
+----------------+
| |
+------------>+ Domain Log |
| | |
+--------------+ | +----------------+
| | |
| Dummy Client +---------+
| | |
+--------------+ | +----------------+
| | |
+------------>+ Application Log|
| |
+----------------+
After this we went into our code and and for every logger.warning
and every other loggger.<severity>
we checked which logger should be used.
Configuring the Sentry would resemble:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
LOGGING = {
'formatters': {
'verbose': {
'format': '%(levelname)s %(asctime)s %(module)s %(process)d %(thread)d %(message)s'
},
'simple': {
'format': '%(asctime)s %(message)s'
},
}, 'handlers': {
'console': {
'level': 'DEBUG',
'class': 'logging.StreamHandler',
'formatter': 'simple'
},
'sentry': {
# leaving the DSN property empty will disable the handler
'level': 'WARNING',
'class': 'sentry_cust.CustomHandler',
}
}, 'loggers': {
'domainlog': {
'handlers': ['sentry'],
'level': 'WARNING',
}, '': {
'handlers': ['console', 'sentry'],
'level': 'INFO',
'propagate': True,
}
}
}
So in the end in most python files we ended up with to loggers declared:
1
2
3
4
import logging
applog = logging.getLogger(__name__)
domainlog = logging.getLogger('domainlog.' + __name__)
And in the request handlers we now have:
1
2
3
4
5
6
try:
api.finalize_payment(order_session)
except PaymentFailedError as pay_err:
domainlog.error(f'payment failed for user {user_id} with error {pay_err}')
except ConnectionTimeoutError as tim_err:
applog.exception('Could not finish payment for user {user_id} due to a connection error')
TLDR
We’ve split our logs into two DSN’s in Sentry and the code in this article shows how to send your logging to the appropriate handlers.