- Log everything you need for debugging and a possible post-mortem on the target machine, but do not aggregate or ship those logs.
- Make certain each log entry is unique, for example by including its PID or some other identifier.
- Aggregate duplicate log entries, and decide on the maximum duplicate count before writing a sentinel.
- When an actionable event occurs, send a message to the monitoring server.
- When an event needs to be monitored, send that event to the monitoring server immediately. This is usually an indication that the transaction is either beginning or ending, or that it has reached some critical timing point such as a call to an external service.
- I like to record a stack trace as the transaction progresses, storing the data locally until the transaction completes, and then use a low-priority service to copy it to storage or a reporting server.
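
The last bullet, buffering a trace locally and shipping it only once the transaction completes, could look roughly like the Python sketch below. The names `TransactionTrace`, `traced`, and `ship` are hypothetical, and `ship` stands in for whatever low-priority copy service you actually use; this is one possible shape, not a definitive implementation.

```python
import time
from contextlib import contextmanager


class TransactionTrace:
    """Collects the steps of one transaction in memory (hypothetical helper)."""

    def __init__(self, txn_id):
        self.txn_id = txn_id
        self.steps = []

    def step(self, name, **details):
        # Record the step locally; nothing leaves the machine yet.
        self.steps.append({"step": name, "at": time.time(), **details})


@contextmanager
def traced(txn_id, ship):
    """Buffer steps locally and hand the whole trace to `ship` exactly once.

    `ship` is an assumed callable, e.g. one that enqueues the trace for a
    low-priority uploader.
    """
    trace = TransactionTrace(txn_id)
    status = "ok"
    try:
        yield trace
    except Exception as exc:
        status = "failed"
        trace.step("error", error=repr(exc))
        raise
    finally:
        # The trace is shipped when the transaction ends, on success or failure.
        ship({"txn": txn_id, "status": status, "steps": trace.steps})


if __name__ == "__main__":
    outbox = []  # stand-in for the real reporting server
    with traced("order-1", outbox.append) as t:
        t.step("begin")
        t.step("call_external", service="billing")
        t.step("end")
    print(outbox[0]["status"], len(outbox[0]["steps"]))
```

Shipping in the `finally` clause means a failed transaction still produces exactly one trace, which is usually the one you care about most.
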

Logstash and Elasticsearch are interesting tools, but they are not without their challenges. Once you start talking about scaling and capacity issues, logging only gets worse. It never seems to improve.
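
One way to keep local log volume in check is the duplicate handling from the list above: pass the first few copies of a message, write a single sentinel, then drop the rest. Here is a minimal sketch using Python's standard `logging` module. `DuplicateFilter`, `build_logger`, and the `MAX_DUPES` threshold are hypothetical names and values chosen for illustration; `%(process)d` is the standard `LogRecord` attribute for the PID.

```python
import io
import logging

MAX_DUPES = 3  # assumed threshold; pick whatever suits your volume


class DuplicateFilter(logging.Filter):
    """Pass the first `max_dupes` copies, then one sentinel, then drop."""

    def __init__(self, max_dupes=MAX_DUPES):
        super().__init__()
        self.max_dupes = max_dupes
        self.counts = {}

    def filter(self, record):
        message = record.getMessage()
        key = (record.levelno, message)
        count = self.counts.get(key, 0) + 1
        self.counts[key] = count
        if count <= self.max_dupes:
            return True
        if count == self.max_dupes + 1:
            # Rewrite this one record into the sentinel entry.
            record.msg = "suppressing further duplicates of: " + message
            record.args = None
            return True
        return False


def build_logger(stream, name="app"):
    """Build a logger whose entries carry the PID and pass through the filter."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(asctime)s pid=%(process)d %(levelname)s %(message)s")
    )
    handler.addFilter(DuplicateFilter())
    logger.addHandler(handler)
    return logger


if __name__ == "__main__":
    buf = io.StringIO()
    log = build_logger(buf)
    for _ in range(10):
        log.warning("disk full")
    print(buf.getvalue(), end="")
```

The counts here live for the life of the process; a real setup would probably reset them on a timer so that a recurring problem is reported again later.
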