Enhancing logging in Sunbird Desktop App
Background:
Currently, the Desktop App has a logging system that is configurable and keeps backups on log rotation. It writes logs of all types ('all' | 'trace' | 'fatal' | 'error' | 'off' | 'info' | 'warn' | 'debug') to an app.log file, which rotates at 10 MB and keeps up to 3 compressed backups. A dedicated error.log file stores only error logs (10 MB | 3 backups). The system is built on top of log4js. It is difficult to query logs by date, and there is no distinction between different kinds of logs.
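For reference, the setup described above would correspond roughly to a log4js configuration object like the sketch below. The file names, sizes and backup counts come from the description; the appender names and the use of a `logLevelFilter` appender for error.log are assumptions, not the app's actual configuration.

```typescript
// Sketch only: appender names ("app", "errorFile", "errorsOnly") are assumed.
const TEN_MB = 10 * 1024 * 1024;

const currentLogConfig = {
  appenders: {
    // All log types go to app.log; rotate at 10 MB, keep 3 compressed backups.
    app: { type: "file", filename: "app.log", maxLogSize: TEN_MB, backups: 3, compress: true },
    // Same rotation policy for error.log.
    errorFile: { type: "file", filename: "error.log", maxLogSize: TEN_MB, backups: 3, compress: true },
    // Forward only messages at ERROR level or above to error.log.
    errorsOnly: { type: "logLevelFilter", appender: "errorFile", level: "error" },
  },
  categories: {
    default: { appenders: ["app", "errorsOnly"], level: "all" },
  },
};
```

An object of this shape is what would be passed to `log4js.configure`. Note that it has a single category and no date-based file naming, which matches the two limitations mentioned above.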
Solution:
Log Categories:
Sunbird desktop app logs can be categorized into different sections based on the task they relate to or the meaning they carry. Grouping logs into categories helps to debug, sync and analyze the different aspects of the Desktop app.
# | Log Category | Enabled | Syncs to platform | Log storage | Log format |
---|---|---|---|---|---|
1 | App Install / Update | Enabled by default. | No | Last 3 Logs will be stored | String |
2 | Application | Enabled by default with App log level set to 'INFO'. | No | Last 7 days logs. | String |
3 | Debug | Disabled, can be enabled when needed for a short amount of time | No | Last 3 Logs will be stored | String |
4 | Crash | Enabled by default. | Yes | Last 10 Logs will be stored | String |
5 | Error | Enabled by default. | Yes | Last 500 Logs will be stored | String |
6 | Performance | Enabled by default. | Yes | Last 100 Logs will be stored | JSON |
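The table above can also be expressed as typed data that the logging code could consult. This is a minimal sketch: the type names, the category keys and the `categoriesToSync` helper are hypothetical, while the values are taken from the table.

```typescript
// Retention is either "last N logs" or "last N days".
type Retention = { kind: "count"; max: number } | { kind: "days"; max: number };

interface LogCategoryPolicy {
  enabledByDefault: boolean;
  syncsToPlatform: boolean;
  retention: Retention;
  format: "string" | "json";
}

type LogCategory = "install" | "application" | "debug" | "crash" | "error" | "performance";

const LOG_POLICIES: Record<LogCategory, LogCategoryPolicy> = {
  install:     { enabledByDefault: true,  syncsToPlatform: false, retention: { kind: "count", max: 3 },   format: "string" },
  application: { enabledByDefault: true,  syncsToPlatform: false, retention: { kind: "days",  max: 7 },   format: "string" },
  debug:       { enabledByDefault: false, syncsToPlatform: false, retention: { kind: "count", max: 3 },   format: "string" },
  crash:       { enabledByDefault: true,  syncsToPlatform: true,  retention: { kind: "count", max: 10 },  format: "string" },
  error:       { enabledByDefault: true,  syncsToPlatform: true,  retention: { kind: "count", max: 500 }, format: "string" },
  performance: { enabledByDefault: true,  syncsToPlatform: true,  retention: { kind: "count", max: 100 }, format: "json" },
};

// Categories whose logs need to be sent to the platform.
function categoriesToSync(): LogCategory[] {
  return (Object.keys(LOG_POLICIES) as LogCategory[]).filter(
    (category) => LOG_POLICIES[category].syncsToPlatform
  );
}
```

Keeping the policy in one place means the retention and sync rules can change without touching the code that writes the logs.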
App Install / Update:
These logs will be generated when the app is being installed on the system (Windows or Ubuntu), by hooking into the Electron-builder installation hooks. We can keep the last 2-3 installation logs.
Application:
All logs generated by the app whose log level is at or above the app log level will be collected here. By default, the app log level will be set to 'INFO'.
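The filtering rule can be sketched as a simple comparison on an ordered list of levels. The function and type names are hypothetical, and the sketch assumes that a message at exactly the app log level is kept, which is how log4js treats its level threshold.

```typescript
// Levels in increasing order of severity; ALL and OFF are thresholds only.
const LEVEL_ORDER = ["ALL", "TRACE", "DEBUG", "INFO", "WARN", "ERROR", "FATAL", "OFF"] as const;
type Level = (typeof LEVEL_ORDER)[number];
type MessageLevel = Exclude<Level, "ALL" | "OFF">;

// True when a message at `messageLevel` should be written to the
// Application log, given the configured app log level.
function isEnabled(messageLevel: MessageLevel, appLevel: Level): boolean {
  return LEVEL_ORDER.indexOf(messageLevel) >= LEVEL_ORDER.indexOf(appLevel);
}
```

With the default app log level of 'INFO', `isEnabled("WARN", "INFO")` is true and `isEnabled("DEBUG", "INFO")` is false.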
Debug:
All logs generated by the app, irrespective of the app log level, will be collected here. This will be enabled at the user's request for a short amount of time. When enabled, the app log level will be set to 'ALL'. This is used when we can't debug from telemetry or from application logs. In the future, these logs can be sent when raising support tickets.
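One way to keep debug logging limited to a short period is to give it an expiry time, after which the level falls back to the normal app log level. The `DebugWindow` class below is a hypothetical sketch of that idea, not an existing part of the app; the `now` parameters exist only to make the behaviour easy to test.

```typescript
class DebugWindow {
  private expiresAt = 0;
  private baseLevel: string;

  constructor(baseLevel: string = "INFO") {
    this.baseLevel = baseLevel;
  }

  // Turn debug logging on for `durationMs` milliseconds.
  enable(durationMs: number, now: number = Date.now()): void {
    this.expiresAt = now + durationMs;
  }

  // Turn debug logging off immediately.
  disable(): void {
    this.expiresAt = 0;
  }

  // 'ALL' while the debug window is open, the normal app level otherwise.
  effectiveLevel(now: number = Date.now()): string {
    return now < this.expiresAt ? "ALL" : this.baseLevel;
  }
}
```

For example, `new DebugWindow().enable(15 * 60 * 1000)` would open a 15-minute debug window; the duration itself is an assumption, since the design does not fix one.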
Crash:
These logs are generated when the app (Electron) crashes. They are produced in minidump format and need further processing before any analysis can be done.
Syncing Crash logs to the platform:
Solution 1: Processing minidump in server
We require an API to sync minidumps and a batch job to process them.
- API to store minidumps: This API will accept minidumps from apps and store them in the blob store.
- Batch job: This job will retrieve minidumps from the blob store and, using Electron symbols and a minidump library, turn them into meaningful data. This data can be stored in Druid or Elasticsearch for analysis.
Note: This requires an implementation design review.
Solution 2: Processing minidump in the desktop app
We require Electron symbols and a minidump parsing library to process crash logs in the app. Once processed, the data can be synced to the error sync API.
Error:
All unhandled exceptions and unhandled rejections will be collected in these logs. The app keeps the last 500 error logs and syncs them to the platform for further processing using the network queue.
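A minimal sketch of this collection step is shown below, assuming the handlers run in the Electron main (Node.js) process. `BoundedLog`, `ErrorLogEntry`, `toEntry` and `installErrorHandlers` are hypothetical names, and syncing through the network queue is left out.

```typescript
interface ErrorLogEntry {
  ts: number;
  kind: "uncaughtException" | "unhandledRejection";
  message: string;
  stack?: string;
}

// Keeps only the most recent `capacity` items, dropping the oldest first.
class BoundedLog<T> {
  private items: T[] = [];
  private capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  push(item: T): void {
    this.items.push(item);
    if (this.items.length > this.capacity) {
      this.items.shift();
    }
  }

  snapshot(): T[] {
    return [...this.items];
  }
}

const errorLog = new BoundedLog<ErrorLogEntry>(500);

// Rejections can carry any value, so normalise non-Error reasons.
function toEntry(kind: ErrorLogEntry["kind"], reason: unknown): ErrorLogEntry {
  const err = reason instanceof Error ? reason : new Error(String(reason));
  return { ts: Date.now(), kind, message: err.message, stack: err.stack };
}

// Call once at startup. After an uncaught exception the process may be in an
// inconsistent state, so a real handler would likely also restart or exit.
function installErrorHandlers(): void {
  process.on("uncaughtException", (err) => errorLog.push(toEntry("uncaughtException", err)));
  process.on("unhandledRejection", (reason) => errorLog.push(toEntry("unhandledRejection", reason)));
}
```

The entries returned by `errorLog.snapshot()` are what would be handed to the network queue for syncing.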
Syncing Error logs to the platform:
The Sunbird platform already has an error log aggregation API; the same can be used here.
Performance:
This log is generated for each task, API call, etc., capturing the time it took to complete, the DID and other metrics (CPU, memory usage, etc.). Perf logs will be generated for the tasks below:
- App startup
- Content Import
- Content Export
- Content Download
- Content Delete
- API calls
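The tasks above could be instrumented with a small wrapper that times the task and emits one JSON-friendly record per run. This is a sketch under assumptions: `measure`, `PerfLog` and the `sink` callback are hypothetical names, and the exact set of metrics is not fixed by this design. It relies on the Node.js `process.cpuUsage` and `process.memoryUsage` calls available in the Electron main process.

```typescript
interface PerfLog {
  task: string;          // e.g. "Content Import"
  did: string;           // device id
  durationMs: number;    // wall-clock time taken by the task
  cpuUserMicros: number; // user CPU time consumed while the task ran
  rssBytes: number;      // resident memory at completion
  ts: number;            // completion time (epoch ms)
}

// Runs `fn`, then reports a PerfLog to `sink` whether it succeeded or threw.
async function measure<T>(
  task: string,
  did: string,
  fn: () => Promise<T> | T,
  sink: (log: PerfLog) => void
): Promise<T> {
  const start = performance.now();
  const cpuStart = process.cpuUsage();
  try {
    return await fn();
  } finally {
    const cpu = process.cpuUsage(cpuStart); // delta since cpuStart
    sink({
      task,
      did,
      durationMs: performance.now() - start,
      cpuUserMicros: cpu.user,
      rssBytes: process.memoryUsage().rss,
      ts: Date.now(),
    });
  }
}
```

A caller would wrap a task as `measure("Content Import", did, () => importContent(file), writePerfLog)`, where `importContent` and `writePerfLog` stand in for the app's own functions; `JSON.stringify` on the record gives the JSON log line.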
Syncing Perf log to the platform:
Note: This requires an implementation design review.
Solution 1: Sync to new API.
We require a new API to sync these logs to the platform and to analyze/visualize them.
Solution 2: Log Telemetry metrics event.
We can convert each perf log to a telemetry metrics event. These events will be synced to the platform along with all other telemetry events.
Solution 3: Local analytics.
We can build local analytics from all the perf logs and refer to them when needed.
Log levels
The desktop app will have all traditional log levels and some custom levels.
# | Level | Description |
---|---|---|
1 | INFO | For logging information messages. |
2 | DEBUG | For logging messages for debugging purposes. |
3 | ERROR | For logging errors. |
4 | FATAL | For logging errors that are fatal. |
5 | WARN | For logging warning messages. |
6 | TRACE | For logging messages to help trace errors. |
7 | PERF | For logging all performance information, such as how long an import took, how long a content search took, etc. |
8 | DB_ERROR | For logging errors from Database. |
9 | NETWORK_ERROR | For logging network-related errors. |
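Since the current system is built on log4js, the custom levels could be declared through the `levels` section of its configuration, where each level has a numeric `value` and a `colour`. The fragment below is a sketch: the numeric values and colours are assumptions chosen to sit between log4js's built-in INFO (20000), WARN (30000), ERROR (40000) and FATAL (50000) values, and the final ordering needs to be agreed as part of the design.

```typescript
// Hypothetical values: PERF sits just above INFO, the two error
// variants sit between ERROR and FATAL.
const customLevels = {
  PERF: { value: 20500, colour: "cyan" },
  DB_ERROR: { value: 40100, colour: "red" },
  NETWORK_ERROR: { value: 40200, colour: "red" },
};

// Fragment to merge into the object passed to log4js.configure,
// alongside its appenders and categories.
const levelsConfigFragment = { levels: customLevels };
```

With these values, PERF messages would still be written when the app log level is 'INFO', and the two custom error levels would pass any filter that accepts ERROR.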