This section describes the observability features of PQS, which are designed to help you monitor the health and performance of the application.

Approach to observability

PQS uses the OpenTelemetry APIs to provide its observability features. All three signal types (traces, metrics, and logs) can be exported to various backends by supplying the appropriate configuration, as defined by OpenTelemetry protocols and guidelines. This keeps PQS flexible with respect to observability backends: users can choose whatever fits their needs and established infrastructure.

To have PQS emit observability data, an OpenTelemetry Java Agent must be attached to the JVM running PQS. OpenTelemetry’s documentation page on Java Agent Configuration [1] has all the necessary information to get started.

As a frequently requested shortcut (metrics only, served from a Prometheus exposition endpoint embedded in PQS), the following snippet can help you get started. For more details, refer to the official documentation:
$ export OTEL_SERVICE_NAME=pqs
$ export OTEL_TRACES_EXPORTER=none
$ export OTEL_LOGS_EXPORTER=none
$ export OTEL_METRICS_EXPORTER=prometheus
$ export OTEL_EXPORTER_PROMETHEUS_PORT=9090
$ export JDK_JAVA_OPTIONS="-javaagent:path/to/opentelemetry-javaagent.jar"
$ ./scribe.jar pipeline ledger postgres-document ...
PQS Docker images already come pre-configured this way, but users are free to override these values as they see fit for their environments.

Logging

Log level

Set the log level with --logger-level. Possible values are All, Fatal, Error, Warning, Info (default), Debug, Trace:
--logger-level=Debug

Per-logger log level

Use --logger-mappings to adjust the log level for individual loggers. For example, to remove Netty network traffic from a more detailed overall log:
--logger-mappings-io.netty=Warning \
--logger-mappings-io.grpc.netty=Trace

Log pattern

With --logger-pattern, use one of the predefined patterns, such as Plain (default), Standard (the standard format used in DA applications), or Structured, or set your own. See Log Format Configuration [2] for more details. To use a custom format, provide its string representation, such as:
--logger-pattern="%highlight{%fixed{1}{%level}} [%fiberId] %name:%line %highlight{%message} %highlight{%cause} %kvs"

Log format for console output

Use --logger-format to set the log format. Possible values are Plain (default) or Json. These formats can be used for the pipeline command.

Log format for file output

Use --logger-format to set the log format. Possible values are Plain (default), Json, PlainAsync and JsonAsync. They can be used for the interactive commands, such as prune. For PlainAsync and JsonAsync, log entries are written to the destination file asynchronously.

Destination file for file output

Use --logger-destination to set the path to the destination file (default: output.log) for interactive commands, such as prune.

Log format and log pattern combinations

  • Plain / Plain
    00:00:23.737 I [zio-fiber-0] com.digitalasset.scribe.pipeline.pipeline.Impl:34 Starting pipeline on behalf of 'Alice_1::12209982174bbaf1e6283234ab828bcab9b73fbe313315b181134bcae9566d3bbf1b'  application=scribe
    00:00:24.658 I [zio-fiber-0] com.digitalasset.scribe.pipeline.pipeline.Impl:61 Last checkpoint is absent. Seeding from ACS before processing transactions with starting offset '00000000000000000b'  application=scribe
    00:00:25.043 I [zio-fiber-895] com.digitalasset.zio.daml.ledgerapi.package:201 Contract filter inclusive of 1 templates and 0 interfaces  application=scribe
    00:00:25.724 I [zio-fiber-0] com.digitalasset.scribe.pipeline.pipeline.Impl:85 Continuing from offset '00000000000000000b' and index '0' until offset '00000000000000000b'  application=scribe
    
  • Plain / Standard
    component=scribe instance_uuid=5f707d27-8188-4a44-904e-2f98ee9f4177 timestamp=2024-01-16T23:42:38.902+0000 level=INFO correlation_id=tbd description=Starting pipeline on behalf of 'Alice_1::1220c6d22d46d59c8454bd245e5a3bc238e5024d37bfd843dbad6885674f3a9673c5'  scribe=application=scribe
    component=scribe instance_uuid=5f707d27-8188-4a44-904e-2f98ee9f4177 timestamp=2024-01-16T23:42:39.734+0000 level=INFO correlation_id=tbd description=Last checkpoint is absent. Seeding from ACS before processing transactions with starting offset '00000000000000000b'  scribe=application=scribe
    component=scribe instance_uuid=5f707d27-8188-4a44-904e-2f98ee9f4177 timestamp=2024-01-16T23:42:39.982+0000 level=INFO correlation_id=tbd description=Contract filter inclusive of 1 templates and 0 interfaces  scribe=application=scribe
    component=scribe instance_uuid=5f707d27-8188-4a44-904e-2f98ee9f4177 timestamp=2024-01-16T23:42:40.476+0000 level=INFO correlation_id=tbd description=Continuing from offset '00000000000000000b' and index '0' until offset '00000000000000000b'  scribe=application=scribe
    
  • Plain / Custom
    --logger-pattern=%timestamp{yyyy-MM-dd'T'HH:mm:ss} %level %name:%line %highlight{%message} %highlight{%cause} %kvs
    
    2024-01-16T23:55:52 INFO com.digitalasset.scribe.pipeline.pipeline.Impl:34 Starting pipeline on behalf of 'Alice_1::1220444f494b31c0a40c2f393edac3f5900325028c6f810a203a0334cd830ec230c8'  application=scribe
    2024-01-16T23:55:53 INFO com.digitalasset.scribe.pipeline.pipeline.Impl:61 Last checkpoint is absent. Seeding from ACS before processing transactions with starting offset '00000000000000000b'  application=scribe
    2024-01-16T23:55:53 INFO com.digitalasset.zio.daml.ledgerapi.package:201 Contract filter inclusive of 1 templates and 0 interfaces  application=scribe
    2024-01-16T23:55:53 INFO com.digitalasset.scribe.pipeline.pipeline.Impl:85 Continuing from offset '00000000000000000b' and index '0' until offset '00000000000000000b'  application=scribe
    
  • Json / Standard
    {"component":"scribe","instance_uuid":"03c263a0-6e3d-416e-b7f2-0e56b9e34841","timestamp":"2024-01-17T00:04:12.537+0000","level":"INFO","correlation_id":"tbd","description":"Starting pipeline on behalf of 'Alice_1::1220f03ed424480ab4487d88230fc033f3910f4cb4492fea68535a5760744b53dabe'","scribe":{"application":"scribe"}}
    {"component":"scribe","instance_uuid":"03c263a0-6e3d-416e-b7f2-0e56b9e34841","timestamp":"2024-01-17T00:04:13.551+0000","level":"INFO","correlation_id":"tbd","description":"Last checkpoint is absent. Seeding from ACS before processing transactions with starting offset '00000000000000000b'","scribe":{"application":"scribe"}}
    {"component":"scribe","instance_uuid":"03c263a0-6e3d-416e-b7f2-0e56b9e34841","timestamp":"2024-01-17T00:04:13.935+0000","level":"INFO","correlation_id":"tbd","description":"Contract filter inclusive of 1 templates and 0 interfaces","scribe":{"application":"scribe"}}
    {"component":"scribe","instance_uuid":"03c263a0-6e3d-416e-b7f2-0e56b9e34841","timestamp":"2024-01-17T00:04:14.659+0000","level":"INFO","correlation_id":"tbd","description":"Continuing from offset '00000000000000000b' and index '0' until offset '00000000000000000b'","scribe":{"application":"scribe"}}
    
  • Json / Structured
    {"timestamp":"2024-01-17T00:08:25+0000","level":"INFO","thread":"zio-fiber-0","location":"com.digitalasset.scribe.pipeline.pipeline.Impl:34","message":"Starting pipeline on behalf of 'Alice_1::122077c6b00e952ff694e2b25b6f5eb9582f815dfe793e2da668b119481a1dd5acdc'","application":"scribe"}
    {"timestamp":"2024-01-17T00:08:26+0000","level":"INFO","thread":"zio-fiber-0","location":"com.digitalasset.scribe.pipeline.pipeline.Impl:61","message":"Last checkpoint is absent. Seeding from ACS before processing transactions with starting offset '00000000000000000b'","application":"scribe"}
    {"timestamp":"2024-01-17T00:08:26+0000","level":"INFO","thread":"zio-fiber-882","location":"com.digitalasset.zio.daml.ledgerapi.package:201","message":"Contract filter inclusive of 1 templates and 0 interfaces","application":"scribe"}
    {"timestamp":"2024-01-17T00:08:26+0000","level":"INFO","thread":"zio-fiber-0","location":"com.digitalasset.scribe.pipeline.pipeline.Impl:85","message":"Continuing from offset '00000000000000000b' and index '0' until offset '00000000000000000b'","application":"scribe"}
    
  • Json / Custom
    --logger-pattern=%label{timestamp}{%timestamp{yyyy-MM-dd'T'HH:mm:ss}} %label{level}{%level} %label{location}{%name:%line} %label{description}{%message} %label{cause}{%cause} %label{scribe}{%kvs}
    
    {"timestamp":"2024-01-17T00:16:31","level":"INFO","location":"com.digitalasset.scribe.pipeline.pipeline.Impl:34","description":"Starting pipeline on behalf of 'Alice_1::1220ee13431ac437d454ea59d622cfc76599e0846a3caf166b4306d47b1bf83944a6'","scribe":{"application":"scribe"}}
    {"timestamp":"2024-01-17T00:16:33","level":"INFO","location":"com.digitalasset.scribe.pipeline.pipeline.Impl:61","description":"Last checkpoint is absent. Seeding from ACS before processing transactions with starting offset '00000000000000000b'","scribe":{"application":"scribe"}}
    {"timestamp":"2024-01-17T00:16:34","level":"INFO","location":"com.digitalasset.zio.daml.ledgerapi.package:201","description":"Contract filter inclusive of 1 templates and 0 interfaces","scribe":{"application":"scribe"}}
    {"timestamp":"2024-01-17T00:16:35","level":"INFO","location":"com.digitalasset.scribe.pipeline.pipeline.Impl:85","description":"Continuing from offset '00000000000000000b' and index '0' until offset '00000000000000000b'","scribe":{"application":"scribe"}}
    
    Note that you need to use %label{your_label}{format} to describe a JSON attribute-value pair.
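
With --logger-format=Json, each log line is a standalone JSON object, so log output can be filtered with a few lines of code. The following is a minimal sketch in Python and not part of PQS: it assumes the Json / Structured field names shown above (level, message, timestamp), the helper name filter_by_level is hypothetical, and the second sample record is invented for illustration.

```python
import json

# Sample lines in the Json / Structured shape shown above (messages shortened).
# The DEBUG record is made up for this example.
SAMPLE = """\
{"timestamp":"2024-01-17T00:08:25+0000","level":"INFO","thread":"zio-fiber-0","location":"com.digitalasset.scribe.pipeline.pipeline.Impl:34","message":"Starting pipeline","application":"scribe"}
this line is not JSON and is skipped
{"timestamp":"2024-01-17T00:08:26+0000","level":"DEBUG","thread":"zio-fiber-0","location":"example.Location:1","message":"hypothetical debug record","application":"scribe"}
"""


def filter_by_level(lines, level):
    """Yield parsed log records whose "level" field equals the given level."""
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # ignore anything that is not a JSON log record
        if isinstance(record, dict) and record.get("level") == level:
            yield record


if __name__ == "__main__":
    for rec in filter_by_level(SAMPLE.splitlines(), "INFO"):
        print(rec["timestamp"], rec["message"])
```

With a custom Json pattern, the keys are whatever the %label{...} entries define, so the field names in the filter would need to change accordingly.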

Application metrics

Assuming PQS exposes metrics as described above, you can access the following metrics at http://localhost:9090/metrics. Each metric is accompanied by # HELP and # TYPE comments, which describe the meaning of the metric and its type, respectively. Some metric types have additional constituent parts exposed as separate metrics. For example, a histogram metric type tracks max, count, sum, and actual ranged buckets as separate time series. Metrics are labeled where it makes sense, providing additional context such as the type of operation or the template/choice involved.

Conceptual list of metrics (refer to actual metric names in the Prometheus output):
| Type | Name | Description |
| --- | --- | --- |
| gauge | watermark_ix | Current watermark index (transaction ordinal number for consistent reads) |
| counter | pipeline_events_total | Processed ledger events |
| histogram | jdbc_conn_use | Latency of database connections usage |
| histogram | jdbc_conn_isvalid | Latency of database connection validation |
| histogram | jdbc_conn_commit | Latency of database connection commit |
| histogram | total_tx_handling_latency | Total latency of transaction handling in PQS (observed in LAPI to committed in DB) |
| gauge | tx_lag_from_ledger_wallclock | Lag from ledger (wall-clock delta (in ms) from command completion to receipt by pipeline) |
| histogram | pipeline_convert_acs_event | Latency of converting ACS events |
| histogram | pipeline_convert_transaction | Latency of converting transactions |
| histogram | pipeline_prepare_batch_latency | Latency of preparing batches of statements |
| histogram | pipeline_execute_batch_latency | Latency of executing batches of statements |
| histogram | pipeline_progress_watermark_latency | Latency of watermark progression |
| histogram | pipeline_wp_acs_events_size | Number of in-flight units of work in pipeline_wp_acs_events wait point |
| histogram | pipeline_wp_acs_statements_size | Number of in-flight units of work in pipeline_wp_acs_statements wait point |
| histogram | pipeline_wp_acs_batched_statements_size | Number of in-flight units of work in pipeline_wp_acs_batched_statements wait point |
| histogram | pipeline_wp_acs_prepared_statements_size | Number of in-flight units of work in pipeline_wp_acs_prepared_statements wait point |
| histogram | pipeline_wp_events_size | Number of in-flight units of work in pipeline_wp_events wait point |
| histogram | pipeline_wp_statements_size | Number of in-flight units of work in pipeline_wp_statements wait point |
| histogram | pipeline_wp_batched_statements_size | Number of in-flight units of work in pipeline_wp_batched_statements wait point |
| histogram | pipeline_wp_prepared_statements_size | Number of in-flight units of work in pipeline_wp_prepared_statements wait point |
| histogram | pipeline_wp_watermarks_size | Number of in-flight units of work in pipeline_wp_watermarks wait point |
| counter | pipeline_wp_acs_events_total | Number of units of work processed in pipeline_wp_acs_events wait point |
| counter | pipeline_wp_acs_statements_total | Number of units of work processed in pipeline_wp_acs_statements wait point |
| counter | pipeline_wp_acs_batched_statements_total | Number of units of work processed in pipeline_wp_acs_batched_statements wait point |
| counter | pipeline_wp_acs_prepared_statements_total | Number of units of work processed in pipeline_wp_acs_prepared_statements wait point |
| counter | pipeline_wp_events_total | Number of units of work processed in pipeline_wp_events wait point |
| counter | pipeline_wp_statements_total | Number of units of work processed in pipeline_wp_statements wait point |
| counter | pipeline_wp_batched_statements_total | Number of units of work processed in pipeline_wp_batched_statements wait point |
| counter | pipeline_wp_prepared_statements_total | Number of units of work processed in pipeline_wp_prepared_statements wait point |
| counter | pipeline_wp_watermarks_total | Number of units of work processed in pipeline_wp_watermarks wait point |
| counter | app_restarts_total | Tracks number of times recoverable failures forced the pipeline to restart |
| gauge | grpc_up | Indicator whether gRPC channel is up and operational |
| gauge | jdbc_conn_pool_up | Indicator whether JDBC connection pool is up and operational |
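
For quick ad-hoc checks without a full Prometheus setup, the exposition text can be parsed directly. The sketch below is a simplified illustration in Python, not a complete parser of the Prometheus text format: it assumes label values contain no commas or escaped quotes, and it ignores optional timestamps. The sample text uses the conceptual names from the table; the real names, and the type label shown here, are assumptions that should be checked against the actual /metrics output.

```python
# Illustrative exposition text; names and labels are placeholders, not real PQS output.
SAMPLE = """\
# HELP watermark_ix Current watermark index
# TYPE watermark_ix gauge
watermark_ix 384.0
# TYPE pipeline_events_total counter
pipeline_events_total{type="create"} 9.0
pipeline_events_total{type="archive"} 1.0
"""


def parse_samples(text):
    """Parse exposition text into a list of (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and # HELP / # TYPE comments
        labels = {}
        if "{" in line:
            name, _, rest = line.partition("{")
            label_str, _, rest = rest.rpartition("}")
            for pair in filter(None, label_str.split(",")):
                key, _, val = pair.partition("=")
                labels[key.strip()] = val.strip().strip('"')
        else:
            name, _, rest = line.partition(" ")
        value = float(rest.split()[0])  # first token after the series is the value
        samples.append((name.strip(), labels, value))
    return samples


if __name__ == "__main__":
    for name, labels, value in parse_samples(SAMPLE):
        print(name, labels, value)
```

In practice the text would come from an HTTP GET against the metrics endpoint instead of the SAMPLE constant.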

Grafana dashboard

Based on the metrics described above, it is possible to build a comprehensive dashboard to monitor PQS. A vendor-supplied Grafana dashboard for PQS can be downloaded from the artifacts repository (see pqs-download). You may want to use it as a starting point for your own.
grafana/v9.4.0/dashboard.json
grafana/v10.4.0/dashboard.json
grafana/v11.0.0/dashboard.json

Health check

The health of the PQS process can be monitored using the health check endpoint /livez. The health check endpoint is available on the configured network interface (--health-address) and TCP port (--health-port). Note the default is 127.0.0.1:8080.
$ curl http://localhost:8080/livez
{"status":"ok"}
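
The same check can be scripted, for example as a readiness gate in a deployment script. This is a minimal sketch using only the Python standard library; the function names is_live and check_livez are hypothetical, and the URL assumes the default address and port mentioned above.

```python
import json
import urllib.request


def is_live(body: bytes) -> bool:
    """Return True when a /livez response body reports status "ok"."""
    try:
        return json.loads(body).get("status") == "ok"
    except (ValueError, AttributeError):
        return False  # not JSON, or not a JSON object


def check_livez(url: str = "http://localhost:8080/livez", timeout: float = 2.0) -> bool:
    """Query the health endpoint; any connection or HTTP error counts as not live."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return is_live(resp.read())
    except OSError:  # covers URLError, HTTPError, timeouts, refused connections
        return False


if __name__ == "__main__":
    print("live" if check_livez() else "not live")
```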

Tracing of pipeline execution

PQS instruments the most critical parts of its operations with tracing to provide insights into the execution flow and performance. Traces can be exported to various OpenTelemetry backends by providing appropriate configuration, for example:
$ export OTEL_TRACES_EXPORTER=otlp
$ export OTEL_EXPORTER_OTLP_PROTOCOL=grpc
$ export OTEL_EXPORTER_OTLP_ENDPOINT="http://otel-collector:4317"
$ export JDK_JAVA_OPTIONS="-javaagent:path/to/opentelemetry-javaagent.jar"
$ ./scribe.jar pipeline ledger postgres-document ...
The following root spans are emitted by PQS:
| Span name | Description |
| --- | --- |
| process metadata and schema | interactions that happen when PQS starts up and ensures its datastore is ready for operations |
| initialization routine | interactions that happen when PQS establishes its offset range boundaries (including seeding from ACS if requested) on startup |
| consume com.daml.ledger.api.v1.TransactionService/GetTransactions, consume com.daml.ledger.api.v1.TransactionService/GetTransactionTrees | [Daml SDK v2.x] timeline of processing a ledger transaction from delivery over gRPC to its persistence to datastore |
| consume com.daml.ledger.api.v2.UpdateService/GetUpdates, consume com.daml.ledger.api.v2.UpdateService/GetUpdateTrees | [Daml SDK v3.x] timeline of processing a ledger transaction from delivery over gRPC to its persistence to datastore |
| execute datastore transaction | interactions when a batch of transactions is persisted to the datastore |
| advance datastore watermark | interactions when the latest consecutive watermark is persisted to the datastore |
All spans are enriched with contextual information through OpenTelemetry’s attributes and events where appropriate. It is advisable to become familiar with this contextual data.

Because execution is asynchronous and parallel, PQS makes heavy use of span links [3] to highlight causal relationships between independent traces. Modern trace visualisation tools use this information to provide a usable representation of, and navigation through, the traces involved.

Below is an example of causal trace data that spans the receipt of a transaction from the Ledger API all the way to it becoming visible through the PQS SQL API in Postgres.
Span #110
Trace ID       : 042ce1ffa24b34b38472933ac8209d54
Parent ID      :
ID             : d5c0071e1d9bbf76
Name           : consume com.daml.ledger.api.v1.TransactionService/GetTransactionTrees
Kind           : Consumer
Start time     : 2024-11-06 03:16:43.004004 +0000 UTC
End time       : 2024-11-06 03:16:43.004193 +0000 UTC
Status code    : Unset
Status message :
Attributes:
     -> messaging.operation.name: Str(consume)
     -> messaging.batch.message_count: Int(1)
     -> messaging.destination.name: Str(com.daml.ledger.api.v1.TransactionService/GetTransactionTrees)
     -> messaging.system: Str(canton)
     -> messaging.operation.type: Str(process)

Span #123
Trace ID       : 042ce1ffa24b34b38472933ac8209d54
Parent ID      : d5c0071e1d9bbf76
ID             : 9d60e1f4c42dce76
Name           : export transaction tree
Kind           : Internal
Start time     : 2024-11-06 03:16:43.004134 +0000 UTC
End time       : 2024-11-06 03:16:43.024574 +0000 UTC
Status code    : Unset
Status message :
Attributes:
     -> daml.effective_at: Str(2024-11-06T03:16:42.827847Z)
     -> daml.command_id: Str(3563113460)
     -> daml.events_count: Int(3)
     -> daml.workflow_id: Empty()
     -> daml.transaction_id: Str(122056219af2a73f913e1c2f0ce4422c156bc9cfdb5e5d49baaee0053bf3787f4a97)
     -> daml.offset: Str(000000000000000261)
Events:
SpanEvent #0
     -> Name: canonicalizing transaction tree
     -> Timestamp: 2024-11-06 03:16:43.004809542 +0000 UTC
SpanEvent #1
     -> Name: canonicalized transaction tree
     -> Timestamp: 2024-11-06 03:16:43.005138375 +0000 UTC
SpanEvent #2
     -> Name: converting canonical transaction to domain model
     -> Timestamp: 2024-11-06 03:16:43.005690625 +0000 UTC
SpanEvent #3
     -> Name: converted canonical transaction to domain model
     -> Timestamp: 2024-11-06 03:16:43.006170917 +0000 UTC
SpanEvent #4
     -> Name: released transaction model into batch
     -> Timestamp: 2024-11-06 03:16:43.015018459 +0000 UTC
SpanEvent #5
     -> Name: prepared SQL statements for transaction model
     -> Timestamp: 2024-11-06 03:16:43.015437 +0000 UTC
SpanEvent #6
     -> Name: flushed transaction model SQL to datastore
     -> Timestamp: 2024-11-06 03:16:43.019356042 +0000 UTC
SpanEvent #7
     -> Name: advanced datastore watermark
     -> Timestamp: 2024-11-06 03:16:43.024570334 +0000 UTC
     -> Attributes::
          -> index: Int(384)
          -> offset: Str(000000000000000261)
Links:
SpanLink #0
     -> Trace ID: 839da768a12333920b709410fb73911a
     -> ID: 276627b6e10f62c5
     -> TraceState:
     -> Attributes::
          -> target: Str(↥ ledger submission)
SpanLink #1
     -> Trace ID: 76c58361d46c08761c37ef5821e8fb78
     -> ID: 6051f05f10af0399
     -> TraceState:
     -> Attributes::
          -> target: Str(↧ persist to datastore)
SpanLink #2
     -> Trace ID: 71e67e2420deeef36ef3efacea6399dc
     -> ID: 161b5911e7a0ec18
     -> TraceState:
     -> Attributes::
          -> target: Str(↧ advance watermark)
Span #115
Trace ID       : 76c58361d46c08761c37ef5821e8fb78
Parent ID      :
ID             : 81f5f42361aa93ee
Name           : execute datastore transaction
Kind           : Internal
Start time     : 2024-11-06 03:16:43.015931 +0000 UTC
End time       : 2024-11-06 03:16:43.020991 +0000 UTC
Status code    : Unset
Status message :

Span #111
Trace ID       : 76c58361d46c08761c37ef5821e8fb78
Parent ID      : 81f5f42361aa93ee
ID             : 4bf8484e99999c64
Name           : acquire connection
Kind           : Internal
Start time     : 2024-11-06 03:16:43.016475 +0000 UTC
End time       : 2024-11-06 03:16:43.016688 +0000 UTC
Status code    : Unset
Status message :

Span #113
Trace ID       : 76c58361d46c08761c37ef5821e8fb78
Parent ID      : 81f5f42361aa93ee
ID             : 6051f05f10af0399
Name           : execute batch
Kind           : Internal
Start time     : 2024-11-06 03:16:43.016828 +0000 UTC
End time       : 2024-11-06 03:16:43.019494 +0000 UTC
Status code    : Unset
Status message :
Attributes:
     -> scribe.batch.models_count: Int(37)
Links:
SpanLink #0
     -> Trace ID: 33736b299a690b885c2314b9b17bde05
     -> ID: aba3d1dd6024ff71
     -> TraceState:
     -> Attributes::
          -> offset: Str(00000000000000025c)
          -> target: Str(↥ incoming transaction)
SpanLink #1
     -> Trace ID: 17f3edce9565defd379bf3ab8243f86d
     -> ID: 076afe5b4aac1212
     -> TraceState:
     -> Attributes::
          -> offset: Str(00000000000000025d)
          -> target: Str(↥ incoming transaction)
SpanLink #2
     -> Trace ID: 646ae61de95731c7726a6caee2d69ee9
     -> ID: bca9f5c28de74c90
     -> TraceState:
     -> Attributes::
          -> offset: Str(00000000000000025e)
          -> target: Str(↥ incoming transaction)
SpanLink #3
     -> Trace ID: 9ebd4d4f288b8b338f4192c0d7ea1b8c
     -> ID: a1d145fa9d76d5b3
     -> TraceState:
     -> Attributes::
          -> offset: Str(00000000000000025f)
          -> target: Str(↥ incoming transaction)
SpanLink #4
     -> Trace ID: e0716f968b5019a450da04317ea8f776
     -> ID: a75658ce89441bee
     -> TraceState:
     -> Attributes::
          -> offset: Str(000000000000000260)
          -> target: Str(↥ incoming transaction)
SpanLink #5
     -> Trace ID: 042ce1ffa24b34b38472933ac8209d54
     -> ID: 9d60e1f4c42dce76
     -> TraceState:
     -> Attributes::
          -> offset: Str(000000000000000261)
          -> target: Str(↥ incoming transaction)

Span #112
Trace ID       : 76c58361d46c08761c37ef5821e8fb78
Parent ID      : 6051f05f10af0399
ID             : 00419239933510fa
Name           : execute SQL
Kind           : Internal
Start time     : 2024-11-06 03:16:43.016855 +0000 UTC
End time       : 2024-11-06 03:16:43.019162 +0000 UTC
Status code    : Unset
Status message :
Attributes:
     -> scribe.__contracts.rows_count: Int(9)
     -> scribe.__exercises.rows_count: Int(3)
     -> scribe.__events.rows_count: Int(12)
     -> scribe.__archives.rows_count: Int(1)
     -> scribe.__transactions.rows_count: Int(6)

Span #114
Trace ID       : 76c58361d46c08761c37ef5821e8fb78
Parent ID      : 81f5f42361aa93ee
ID             : 9872ff55adc9e370
Name           : commit transaction
Kind           : Internal
Start time     : 2024-11-06 03:16:43.019916 +0000 UTC
End time       : 2024-11-06 03:16:43.020742 +0000 UTC
Status code    : Unset
Status message :
Span #124
Trace ID       : 71e67e2420deeef36ef3efacea6399dc
Parent ID      :
ID             : 161b5911e7a0ec18
Name           : advance datastore watermark
Kind           : Internal
Start time     : 2024-11-06 03:16:43.021507 +0000 UTC
End time       : 2024-11-06 03:16:43.024872 +0000 UTC
Status code    : Unset
Status message :
Attributes:
     -> scribe.watermark.offset: Str(000000000000000261)
     -> scribe.watermark.ix: Int(384)
Links:
SpanLink #0
     -> Trace ID: 76c58361d46c08761c37ef5821e8fb78
     -> ID: 6051f05f10af0399
     -> TraceState:
     -> Attributes::
          -> target: Str(↥ persist to datastore)

Span #116
Trace ID       : 71e67e2420deeef36ef3efacea6399dc
Parent ID      : 161b5911e7a0ec18
ID             : 33ab3918ebfe138d
Name           : acquire connection
Kind           : Internal
Start time     : 2024-11-06 03:16:43.022009 +0000 UTC
End time       : 2024-11-06 03:16:43.022222 +0000 UTC
Status code    : Unset
Status message :

Span #6
Trace ID       : 71e67e2420deeef36ef3efacea6399dc
Parent ID      : 161b5911e7a0ec18
ID             : 1a66240dfd597654
Name           : UPDATE scribe.__watermark
Kind           : Client
Start time     : 2024-11-06 03:16:43.022737084 +0000 UTC
End time       : 2024-11-06 03:16:43.023134917 +0000 UTC
Status code    : Unset
Status message :
Attributes:
     -> db.operation: Str(UPDATE)
     -> db.sql.table: Str(__watermark)
     -> db.name: Str(scribe)
     -> db.connection_string: Str(postgresql://postgres-scribe:5432)
     -> server.address: Str(postgres-scribe)
     -> server.port: Int(5432)
     -> db.user: Str(pguser)
     -> db.statement: Str(update __watermark set "offset" = ?, ix = ?;)
     -> db.system: Str(postgresql)

Span #117
Trace ID       : 71e67e2420deeef36ef3efacea6399dc
Parent ID      : 161b5911e7a0ec18
ID             : f0b24fb074fe41f8
Name           : commit transaction
Kind           : Internal
Start time     : 2024-11-06 03:16:43.023629 +0000 UTC
End time       : 2024-11-06 03:16:43.024157 +0000 UTC
Status code    : Unset
Status message :
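
The span events attached to Span #123 (export transaction tree) above can be turned into per-stage timings with a little arithmetic. The following Python sketch is only an illustration of that calculation: the helper names are hypothetical, the event data is copied from the dump above, and the parser assumes the timestamp layout shown there with a +0000 offset.

```python
from datetime import datetime, timezone

# Span events of Span #123, copied from the dump above.
EVENTS = [
    ("canonicalizing transaction tree", "2024-11-06 03:16:43.004809542 +0000 UTC"),
    ("canonicalized transaction tree", "2024-11-06 03:16:43.005138375 +0000 UTC"),
    ("converting canonical transaction to domain model", "2024-11-06 03:16:43.005690625 +0000 UTC"),
    ("converted canonical transaction to domain model", "2024-11-06 03:16:43.006170917 +0000 UTC"),
    ("released transaction model into batch", "2024-11-06 03:16:43.015018459 +0000 UTC"),
    ("prepared SQL statements for transaction model", "2024-11-06 03:16:43.015437 +0000 UTC"),
    ("flushed transaction model SQL to datastore", "2024-11-06 03:16:43.019356042 +0000 UTC"),
    ("advanced datastore watermark", "2024-11-06 03:16:43.024570334 +0000 UTC"),
]


def to_nanos(stamp: str) -> int:
    """Convert a 'YYYY-MM-DD HH:MM:SS.fffffffff +0000 UTC' stamp to ns since the Unix epoch."""
    date_part, time_part = stamp.split()[:2]
    whole, _, frac = time_part.partition(".")
    dt = datetime.strptime(f"{date_part} {whole}", "%Y-%m-%d %H:%M:%S")
    seconds = int(dt.replace(tzinfo=timezone.utc).timestamp())
    # Pad the fraction to nanosecond precision (it may have fewer than 9 digits).
    return seconds * 1_000_000_000 + int(frac.ljust(9, "0")[:9])


def stage_durations(events):
    """Return (from_event, to_event, elapsed_ns) for each pair of consecutive events."""
    return [
        (n0, n1, to_nanos(t1) - to_nanos(t0))
        for (n0, t0), (n1, t1) in zip(events, events[1:])
    ]


if __name__ == "__main__":
    for start, end, ns in stage_durations(EVENTS):
        print(f"{ns / 1e6:8.3f} ms  {start} -> {end}")
```

For this particular trace, the longest gap (about 8.8 ms) lies between "converted canonical transaction to domain model" and "released transaction model into batch".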

Trace context propagation

PQS is an intermediary between a ledger instance and downstream applications that prefer to access data through SQL rather than by streaming it directly from the Ledger API. Although PQS forms a pipeline between two data storage systems (Canton and PostgreSQL), it stores the original ledger transaction’s trace context (see also open-tracing-ledger-api-client), rather than its own, for the purposes of propagation. This allows downstream applications to decide for themselves how to connect to the original submission’s trace (as a child span or as a new trace connected through span links).
select "offset",
       (trace_context).trace_parent,
       (trace_context).trace_state
from transactions limit 1;
       offset       |                      trace_parent                       |   trace_state
--------------------+---------------------------------------------------------+-----------------
 0000000000000000bb | 00-f35923baa38cc520a1fc3aec6771380b-b4cf363cbf5efa6a-01 | foo=bar,baz=qux
Span #85
    Trace ID       : f35923baa38cc520a1fc3aec6771380b
    Parent ID      : d3300bedd4c64511
    ID             : b4cf363cbf5efa6a
    Name           : MessageDispatcher.handle
    Kind           : Internal
    Start time     : 2024-11-05 04:01:40.808 +0000 UTC
    End time       : 2024-11-05 04:01:40.822694083 +0000 UTC
    Status code    : Unset
    Status message :
Attributes:
     -> canton.class: Str(com.digitalasset.canton.participant.protocol.EnterpriseMessageDispatcher)
↑↑↑ span context propagated through transaction/tree stream in Ledger API

↓↓↓ following parent's links chain leads us to the root span of original submission
Span #19
    Trace ID       : f35923baa38cc520a1fc3aec6771380b
    Parent ID      :
    ID             : de3aed62b5fb43ce
    Name           : com.daml.ledger.api.v1.CommandService/SubmitAndWaitForTransaction
    Kind           : Server
    Start time     : 2024-11-05 04:01:40.569 +0000 UTC
    End time       : 2024-11-05 04:01:40.866904459 +0000 UTC
    Status code    : Unset
    Status message :
Attributes:
     -> rpc.method: Str(SubmitAndWaitForTransaction)
     -> daml.submitter: Str()
     -> rpc.service: Str(com.daml.ledger.api.v1.CommandService)
     -> net.peer.port: Int(38640)
     -> net.transport: Str(ip_tcp)
     -> daml.workflow_id: Str()
     -> daml.command_id: Str(3498760027)
     -> rpc.system: Str(grpc)
     -> net.peer.ip: Str(172.18.0.15)
     -> daml.application_id: Str(appid)
     -> rpc.grpc.status_code: Int(0)
Accessing the data stored in PQS’ transactions.trace_context column allows any application to re-create the propagated trace context [4] and use it with its runtime’s instrumentation library.
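
As an illustration of what re-creating the context involves, the sketch below splits the trace_parent and trace_state values from the query result above into their W3C Trace Context fields. It is a simplified Python example, not PQS code: the names TraceParent, parse_trace_parent, and parse_trace_state are hypothetical, and the parser accepts only the four-field layout of version 00.

```python
import re
from typing import List, NamedTuple, Tuple

# version "-" trace-id "-" parent-id "-" trace-flags, all lowercase hex
_TRACEPARENT = re.compile(r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


class TraceParent(NamedTuple):
    version: str
    trace_id: str
    parent_id: str
    sampled: bool


def parse_trace_parent(value: str) -> TraceParent:
    """Parse a W3C traceparent value such as the trace_parent column shown above."""
    match = _TRACEPARENT.match(value.strip())
    if match is None:
        raise ValueError(f"malformed traceparent: {value!r}")
    version, trace_id, parent_id, flags = match.groups()
    # The spec forbids version ff and all-zero trace or parent ids.
    if version == "ff" or set(trace_id) == {"0"} or set(parent_id) == {"0"}:
        raise ValueError(f"invalid traceparent: {value!r}")
    return TraceParent(version, trace_id, parent_id, bool(int(flags, 16) & 0x01))


def parse_trace_state(value: str) -> List[Tuple[str, str]]:
    """Split a tracestate value like 'foo=bar,baz=qux' into ordered (key, value) pairs."""
    pairs = []
    for item in value.split(","):
        key, sep, val = item.strip().partition("=")
        if sep:
            pairs.append((key, val))
    return pairs


if __name__ == "__main__":
    print(parse_trace_parent("00-f35923baa38cc520a1fc3aec6771380b-b4cf363cbf5efa6a-01"))
    print(parse_trace_state("foo=bar,baz=qux"))
```

The parsed trace_id and parent_id correspond to the Trace ID and ID of Span #85 above. An application would normally hand these values to its OpenTelemetry SDK's propagator or span-context API rather than use them directly.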

Diagnostics

PQS is capable of exporting diagnostic telemetry snapshots. This data export archive contains essential troubleshooting information such as:
  • application thread dumps (over a period of time)
  • application metrics (over a period of time)
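To get this archive, connect to the diagnostics exposition socket, for example with the netcat tool (the example below assumes the socket is bound to port 9091; see da.diagnostics.port in the table below):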
$ nc localhost 9091 > health-dump.zip
$ unzip health-dump.zip
Archive:  health-dump.zip
  inflating: metrics.openmetrics
  inflating: threads-20250307-105606.zip
The table below lists the available configuration sources with priority decreasing from left to right:
| System property | Environment variable | Default value | Description |
| --- | --- | --- | --- |
| da.diagnostics.enabled | DA_DIAGNOSTICS_ENABLED | true | Enables/disables diagnostics data collection and exposition |
| da.diagnostics.host | DA_DIAGNOSTICS_HOST | 127.0.0.1 | Hostname or IP address to use for binding the exposition socket |
| da.diagnostics.port | DA_DIAGNOSTICS_PORT | 0 | Port to use for binding the exposition socket (0 = random port) |
| da.diagnostics.dump.path | DA_DIAGNOSTICS_DUMP_PATH | <empty> | Directory to write to on graceful shutdown (path needs to be an existing writable directory) |
| da.diagnostics.metrics.interval | DA_DIAGNOSTICS_METRICS_INTERVAL | PT10S | Metrics collection interval in ISO 8601 format |
| da.diagnostics.metrics.buffer.size | DA_DIAGNOSTICS_METRICS_BUFFER_SIZE | 60 | Quantity of samples to store for each monitored metric (rolling window) |
| da.diagnostics.metrics.tags | DA_DIAGNOSTICS_METRICS_TAGS | <empty> | Comma-separated list of additional labels to enrich each metric with during exposition (for example, job=myapp,env=staging,deployed=20250101) |
| da.diagnostics.threads.interval | DA_DIAGNOSTICS_THREADS_INTERVAL | PT1M | Thread dumps collection interval in ISO 8601 format |
| da.diagnostics.threads.buffer.size | DA_DIAGNOSTICS_THREADS_BUFFER_SIZE | 10 | Quantity of thread dumps to store (rolling window) |
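
When tuning the interval and buffer size settings, it can help to work out how much history a dump would cover. The Python sketch below assumes that the covered time span is simply the collection interval multiplied by the buffer size, which is one reading of "rolling window" in the table above rather than documented behaviour. The helper names are hypothetical, and the duration parser handles only the PT..H..M..S subset of ISO 8601.

```python
import re
from datetime import timedelta

# Supports only time-based ISO 8601 durations such as PT10S, PT1M, PT1H30M.
_DURATION = re.compile(r"^PT(?:(\d+)H)?(?:(\d+)M)?(?:(\d+(?:\.\d+)?)S)?$")


def parse_duration(text: str) -> timedelta:
    """Parse a simple ISO 8601 duration like 'PT10S' into a timedelta."""
    match = _DURATION.match(text)
    if match is None or not any(match.groups()):
        raise ValueError(f"unsupported duration: {text!r}")
    hours, minutes, seconds = match.groups()
    return timedelta(
        hours=int(hours or 0),
        minutes=int(minutes or 0),
        seconds=float(seconds or 0),
    )


def window(interval: str, buffer_size: int) -> timedelta:
    """Approximate history covered by a rolling buffer of samples."""
    return parse_duration(interval) * buffer_size


if __name__ == "__main__":
    # Default values from the table above.
    print("metrics window:", window("PT10S", 60))
    print("threads window:", window("PT1M", 10))
```

Under that assumption, the default settings would retain roughly ten minutes of both metrics and thread dumps.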

Footnotes

  1. https://opentelemetry.io/docs/zero-code/java/agent/configuration/
  2. https://zio.dev/zio-logging/formatting-log-records/#log-format-configuration
  3. https://opentelemetry.io/docs/specs/otel/overview/#links-between-spans
  4. https://www.w3.org/TR/trace-context/