Parallel OpenTelemetry CLI

watching requests run faster

Nix
Bash
OpenTelemetry

Hello again! In the last post, we had the following two-part script running:

#!/usr/bin/env nix-shell
#!nix-shell --pure -i bash -p bash parallel curl jq cacert
#!nix-shell -I nixpkgs=https://github.com/NixOS/nixpkgs/archive/ee084c02040e864eeeb4cf4f8538d92f7c675671.tar.gz

generate_files() {
  rm -f ./urls.txt
  count=10
  for i in $(seq "$count")
  do
    x=$(shuf -i 1-10 -n 1)
    printf "https://httpbingo.org/delay/%s\n" "$x" >> urls.txt
  done
}

send_requests() {
  curl -fsSL "$1" | jq .data
}
export -f send_requests

generate_files
# Use 10 workers (child processes) to send requests across
parallel -j 10 'send_requests {}' :::: urls.txt

In summary, we GET 10 URLs that could be slow, and request them in parallel. We can exercise this script ourselves and get an idea of its performance independently.
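
For a rough baseline, we can simply wrap a run of the script in time (here I'm assuming it's saved as ./fetch.sh, a name made up for this example):

# Hypothetical filename for the script above; wall-clock time gives a rough baseline
time ./fetch.sh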

For the next hypothetical, let's assume this script is called as part of a larger system. A challenging idea is being able to observe code in action across an entire system, and to understand the performance of one component within that larger system. In other programming languages, there is a movement to add instrumentation to code to help understand these issues. This is achieved with language SDKs and instrumentation placed into commonly used and critical libraries.

For some of us though, our language of choice may not have those SDKs in place (like shell code). Enter otel-cli!

With otel-cli, we can insert telemetry into our shell processes and get information from code paths we couldn't observe before. We don't have to send every trace to a collector, and collectors can help us sample before ingestion into an observability platform. This has some really interesting ramifications: we can instrument build processes, testing, data transformation pipelines, and the list goes on!
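
As a rough sketch of what that can look like, otel-cli can wrap a single command in a span with its exec subcommand. The service and span names below are placeholders, and the flags reflect my reading of the otel-cli README, so double-check them against the version you install:

# Wrap one request in a span; otel-cli runs the command and reports the span
# to whatever OTLP endpoint it has been configured to talk to
otel-cli exec --service "url-fetcher" --name "GET httpbingo" -- \
  curl -fsSL https://httpbingo.org/delay/1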

So, let's talk about instrumenting this script. Let's say a code change reduced the number of child processes in our parallel request step to 2, and our service could no longer complete its requests in a timely fashion. That would look like:

parallel -j 2 'send_requests {}' :::: urls.txt

Which, depending on the delays you generate, roughly triples the time this script takes to complete.
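
To make a regression like that visible in traces rather than just in wall-clock time, one option is to wrap each request in its own span. A minimal sketch, assuming otel-cli is on the PATH and using placeholder service and span names:

send_requests() {
  # Hypothetical instrumentation: each curl call becomes its own span,
  # so slow requests and reduced concurrency show up on the trace timeline
  otel-cli exec --service "parallel-fetch" --name "GET $1" -- \
    curl -fsSL "$1" | jq .data
}
export -f send_requests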

So, let's set up Jaeger to observe this! I used the following docker-compose file from the otel-cli repo:

---
version: '2.1'
services:
  jaeger:
    image: jaegertracing/all-in-one:1.22.0
    ports:
      - "16686:16686"
      - "14268"
      - "14250"
  otel-collector:
    image: otel/opentelemetry-collector:0.24.0
    volumes:
      - ./otel-collector.yaml:/local.yaml
    command: --config /local.yaml
    ports:
      - "4317:4317"
    depends_on:
      - jaeger

The config for otel-collector looks like:

---
receivers:
  otlp:
    protocols:
      grpc:

exporters:
  jaeger:
    endpoint: "jaeger:14250"
    insecure: true

processors:
  batch:

service:
  pipelines:
    traces:
      receivers:
        - otlp
      processors:
        - batch
      exporters:
        - jaeger

And then we start up our local observability platform:

docker-compose up -d

Now we have a place to visualize traces from our local script!
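
Before instrumenting the script itself, it's worth emitting a throwaway span to confirm the pipeline is wired up end to end. A minimal sketch, assuming otel-cli reads the standard OTEL_EXPORTER_OTLP_ENDPOINT environment variable, with placeholder names:

# Point otel-cli at the collector's OTLP gRPC receiver
export OTEL_EXPORTER_OTLP_ENDPOINT=localhost:4317

# Emit a one-off test span; it should show up in the Jaeger UI at http://localhost:16686
otel-cli span --service "smoke-test" --name "hello otel"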

To be continued...