How does threading-profiler (the default mode) work

The blog posts skywalking-profiling and skywalking-python-profiling describe how the threading-profiler works.

The following sequence diagram also illustrates how the profiler works:

sequenceDiagram
    API->>+working thread: get: /api/v1/user/
    rect rgb(0,200,0)
    API->>+profiling thread: start profiling
    profiling thread->>working thread: snapshot
    profiling thread->>working thread: snapshot
    profiling thread->>working thread: snapshot
    profiling thread->>-working thread: snapshot
    end
    working thread-->>-API: response

This works well in threading mode because the whole API request is handled in a single thread, so the profiling thread can capture the complete profiling info for that request.
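The sampling loop in the diagram can be sketched roughly as follows. This is a minimal illustration, not SkyWalking's actual implementation: `profile_thread`, `handle_request`, and `run_demo` are hypothetical names, and the sketch relies on CPython's `sys._current_frames()` to read another thread's stack.

```python
import sys
import threading
import time
import traceback


def profile_thread(thread_id, stop_event, snapshots, interval=0.01):
    """Snapshot the stack of one thread every `interval` seconds."""
    while not stop_event.is_set():
        # sys._current_frames() maps each thread id to that thread's top frame.
        frame = sys._current_frames().get(thread_id)
        if frame is not None:
            names = [f.name for f in traceback.extract_stack(frame)]
            snapshots.append(names)
        stop_event.wait(interval)


def handle_request():
    """Hypothetical stand-in for the work behind GET /api/v1/user/."""
    time.sleep(0.2)


def run_demo():
    snapshots = []
    stop_event = threading.Event()
    worker = threading.Thread(target=handle_request)
    worker.start()
    profiler = threading.Thread(
        target=profile_thread, args=(worker.ident, stop_event, snapshots)
    )
    profiler.start()
    worker.join()      # the request has finished
    stop_event.set()   # so profiling stops as well
    profiler.join()
    return snapshots


if __name__ == "__main__":
    for names in run_demo()[:3]:
        print(" -> ".join(names))
```

Because the request never leaves its working thread, every snapshot taken while it runs belongs to that request.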

Why doesn’t threading-profiler work in greenlet mode

When a Python program runs with gevent + greenlet, the flow looks like this:

sequenceDiagram
    API->>+working thread 1: get: /api/v1/user/
    rect rgb(0,200,0)
    greenlet.HUB-->>+working thread 1: swap in the profiled greenlet
    API->>+profiling thread: start profiling
    profiling thread->>working thread 1: snapshot
    working thread 1-->>-greenlet.HUB : swap out the profiled greenlet
    end
    greenlet.HUB-->>+working thread 1: swap in the other greenlet
    profiling thread->>working thread 1: snapshot
    greenlet.HUB-->>+working thread 2: swap in the profiled greenlet
    profiling thread->>working thread 1: snapshot
    profiling thread->>working thread 1: snapshot
    working thread 2-->>-greenlet.HUB : swap out the profiled greenlet
    profiling thread->>working thread 1: snapshot
    profiling thread->>-working thread 1: snapshot
    working thread 1-->>-greenlet.HUB : swap out the other greenlet
    working thread 1-->>-API: response

In this case, the snapshots of a working thread mix the contexts of several different greenlets, so SkyWalking cannot build a coherent trace stack from them.
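The mixing can be reproduced in a small experiment. The sketch below assumes the `greenlet` package is installed; `profiled_request`, `other_request`, and the two `threading.Event` handshakes are hypothetical helpers that only make the sampling deterministic. A sampler looks at the same working thread three times and sees frames from two different greenlets.

```python
import sys
import threading
import traceback

import greenlet


def sample_thread(thread_id):
    """Return the function names on the thread's current stack."""
    frame = sys._current_frames()[thread_id]
    return [f.name for f in traceback.extract_stack(frame)]


def run_demo():
    samples = []
    ready = threading.Event()  # worker -> sampler: take a snapshot now
    done = threading.Event()   # sampler -> worker: snapshot taken, carry on

    def pause_for_sample():
        ready.set()
        done.wait()
        done.clear()

    def profiled_request(hub):
        pause_for_sample()
        hub.switch()           # swapped out, e.g. while waiting on I/O
        pause_for_sample()

    def other_request(hub):
        pause_for_sample()

    def worker():
        hub = greenlet.getcurrent()  # this thread's main greenlet acts as the hub
        profiled = greenlet.greenlet(profiled_request)
        other = greenlet.greenlet(other_request)
        profiled.switch(hub)   # runs until the profiled greenlet yields
        other.switch(hub)      # another greenlet takes over the same thread
        profiled.switch()      # the profiled greenlet resumes

    thread = threading.Thread(target=worker)
    thread.start()
    for _ in range(3):
        ready.wait()
        ready.clear()
        samples.append(sample_thread(thread.ident))
        done.set()
    thread.join()
    return samples


if __name__ == "__main__":
    for names in run_demo():
        print(" -> ".join(names))
```

A thread-level profiler has no way to tell that the second sample belongs to an unrelated greenlet, which is the confusion described above.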

Fortunately, greenlet provides a tracing API for this purpose (see the greenlet documentation), so we can implement a greenlet profiler to solve this issue.

How the greenlet profiler works

A greenlet profiler leverages greenlet's trace callback. It works like this:

sequenceDiagram
    API->>+working thread 1: get: /api/v1/user/
    rect rgb(0,200,0)
    greenlet.HUB-->>+working thread 1: swap in the profiled greenlet and snapshot
    working thread 1-->>-greenlet.HUB : swap out the profiled greenlet and snapshot
    end
    greenlet.HUB-->>+working thread 1: swap in the other greenlet
    rect rgb(0,200,0)
    greenlet.HUB-->>+working thread 2: swap in the profiled greenlet and snapshot
    working thread 2-->>-greenlet.HUB : swap out the profiled greenlet and snapshot
    end
    working thread 1-->>-greenlet.HUB : swap out the other greenlet
    working thread 1-->>-API: response

We can set a callback function for the greenlet that we need to profile. Then, whenever greenlet.HUB switches that greenlet in or out of a working thread, the callback builds a snapshot of the greenlet’s traceback and sends it to SkyWalking.
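A minimal sketch of such a callback, assuming the `greenlet` package and its `greenlet.settrace` API, might look like the following. It is not SkyWalking's actual profiler: `make_trace_callback`, `handle_request`, and `other_task` are hypothetical names, snapshots are collected in a list instead of being sent anywhere, and the stack is captured only on swap-out (where `gr_frame` is available), with swap-in recorded as a marker.

```python
import traceback

import greenlet


def make_trace_callback(profiled, snapshots):
    """Build a trace callback that records only switches of `profiled`."""

    def on_switch(event, args):
        if event not in ("switch", "throw"):
            return
        origin, target = args
        if target is profiled:
            # Swap in: the profiled greenlet is resuming on this thread.
            snapshots.append(("in", []))
        elif origin is profiled:
            # Swap out: the greenlet is suspended, so gr_frame holds its stack.
            # gr_frame is None once the greenlet has finished.
            frame = origin.gr_frame
            names = []
            if frame is not None:
                names = [f.name for f in traceback.extract_stack(frame)]
            snapshots.append(("out", names))

    return on_switch


def handle_request(hub):
    """Hypothetical profiled request that yields to the hub twice."""
    hub.switch()
    hub.switch()


def other_task(hub):
    """Hypothetical unrelated greenlet sharing the same thread."""
    hub.switch()


def run_demo():
    snapshots = []
    hub = greenlet.getcurrent()  # the main greenlet stands in for greenlet.HUB
    profiled = greenlet.greenlet(handle_request)
    other = greenlet.greenlet(other_task)
    # settrace installs the callback for the current thread and
    # returns the previously installed one.
    previous = greenlet.settrace(make_trace_callback(profiled, snapshots))
    try:
        profiled.switch(hub)
        other.switch(hub)
        while not (profiled.dead and other.dead):
            for task in (profiled, other):
                if not task.dead:
                    task.switch()
    finally:
        greenlet.settrace(previous)
    return snapshots


if __name__ == "__main__":
    for direction, names in run_demo():
        print(direction, names)
```

Switches that involve only the other greenlet are ignored, so the collected snapshots never contain frames from unrelated work.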

The difference between these two profilers

The greenlet profiler takes significantly fewer snapshots during profiling, because it snapshots only when the profiled greenlet is switched in or out rather than sampling the thread repeatedly. As a result, it costs less CPU time than the threading profiler.