Bug: OTLP Responses Should Match Content-Type Of Requests

What is the bug?

The OTLP (OpenTelemetry Protocol) specification outlines the requirements for server responses, including the setting of the Content-Type header based on the response body format. Specifically, the server must set the Content-Type header to application/x-protobuf if the response body is binary-encoded Protobuf payload, and application/json if the response is JSON-encoded Protobuf payload. Furthermore, the server must use the same Content-Type in the response as it received in the request.

However, the current implementation does not adhere to these requirements. The response format is the same regardless of the request format: a success returns a blank response, while an error returns a binary-encoded proto with Content-Type: application/octet-stream.
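To make the requirement concrete, here is a minimal Go sketch of the "same Content-Type as the request" rule. The helper name checkOTLPResponse is hypothetical and is not part of Mimir or of any OTLP library; it only compares media types and ignores parameters such as charset.

```go
package main

import (
	"fmt"
	"mime"
)

// checkOTLPResponse is a hypothetical helper (not Mimir code). It returns an
// error when the response Content-Type does not match the request Content-Type,
// which is what the OTLP specification requires of a server.
func checkOTLPResponse(requestContentType, responseContentType string) error {
	reqType, _, err := mime.ParseMediaType(requestContentType)
	if err != nil {
		return fmt.Errorf("invalid request Content-Type %q: %w", requestContentType, err)
	}
	if responseContentType == "" {
		return fmt.Errorf("response has no Content-Type, want %q", reqType)
	}
	respType, _, err := mime.ParseMediaType(responseContentType)
	if err != nil {
		return fmt.Errorf("invalid response Content-Type %q: %w", responseContentType, err)
	}
	if respType != reqType {
		return fmt.Errorf("response Content-Type is %q, want %q", respType, reqType)
	}
	return nil
}

func main() {
	// The two behaviours described above for a JSON request, then a compliant one.
	fmt.Println(checkOTLPResponse("application/json", ""))
	fmt.Println(checkOTLPResponse("application/json", "application/octet-stream"))
	fmt.Println(checkOTLPResponse("application/json", "application/json"))
}
```

Under this check, both behaviours described above fail for a JSON request: the success response has no Content-Type, and the error response uses application/octet-stream.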

How to reproduce it?

To reproduce this issue, you can send a request to the Mimir endpoint with a JSON payload, as shown below:

curl -i -XPOST -H 'Content-Type: application/json' <mimir endpoint> -d '{
  "resourceMetrics": [
    {
      "scopeMetrics": [
        {
          "metrics": [
            {
              "name": "test_metric",
              "unit": "s",
              "description": "",
              "gauge": {
                "dataPoints": [
                  {
                    "asInt": 1,
                    "timeUnixNano": 1741602284008000000,
                    "attributes": [
                      {
                        "key": "bar_label",
                        "value": {
                          "stringValue": "abc"
                        }
                      }
                    ]
                  }
                ]
              }
            }
          ]
        }
      ]
    }
  ]
}'

This request returns a 200 success response with a blank body and no Content-Type header, even though the request was sent as application/json.

What did you think would happen?

Based on the OTLP specification, I expected the response to have a Content-Type header set to application/json and a body containing an empty JSON object {}.

What was your environment?

The issue can be seen in the current code, which is available at the commit https://github.com/grafana/mimir/commit/20548c3829fde9e7736d1f65ca428f3ea76d3530.

Any additional context to share?

No additional context is available to share.

Impact of the bug

This bug affects the correctness of OTLP responses. A client that sends JSON and parses the response according to the request's Content-Type receives a blank body on success and a binary-encoded proto on errors, neither of which is valid JSON, so it can misinterpret the result of the request.

Solution to the bug

To fix this bug, the implementation must set the Content-Type header based on the response body format: application/x-protobuf if the response body is a binary-encoded Protobuf payload, and application/json if it is a JSON-encoded Protobuf payload. The response must also use the same format as the request, so that a JSON request gets a JSON response and a Protobuf request gets a Protobuf response.

Code changes

The code changes required to fix this bug are as follows:

  • In the otel.go file, modify the HandleOTLPRequest function to set the Content-Type header based on the response body format.
  • In the otel.go file, modify the HandleOTLPRequest function to change the response format based on the request format.

Example code

Here is an example of the modified code:

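import (
    "mime"
    "net/http"
)

func HandleOTLPRequest(w http.ResponseWriter, r *http.Request) {
    // ...
    // Parse the media type so that parameters such as "; charset=utf-8" are ignored.
    mediaType, _, _ := mime.ParseMediaType(r.Header.Get("Content-Type"))
    switch mediaType {
    case "application/json":
        // JSON request: reply with a JSON-encoded response.
        w.Header().Set("Content-Type", "application/json")
        // ...
    case "application/x-protobuf":
        // Protobuf request: reply with a binary-encoded response.
        w.Header().Set("Content-Type", "application/x-protobuf")
        // ...
    }
    // ...
}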
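The success path can be sketched end to end, as an illustration rather than Mimir's actual code. The names writeExportSuccess and exercise are hypothetical, and the /otlp/v1/metrics path is only a placeholder for the test request. The sketch relies on two facts: an export response with no fields set is {} in JSON, and a Protobuf message with no fields set encodes to zero bytes.

```go
package main

import (
	"fmt"
	"mime"
	"net/http"
	"net/http/httptest"
	"strings"
)

// writeExportSuccess is a hypothetical helper (not Mimir code). It writes a
// success response whose Content-Type mirrors the request's Content-Type.
func writeExportSuccess(w http.ResponseWriter, r *http.Request) {
	mediaType, _, _ := mime.ParseMediaType(r.Header.Get("Content-Type"))
	switch mediaType {
	case "application/json":
		w.Header().Set("Content-Type", "application/json")
		w.WriteHeader(http.StatusOK)
		// An export response with no fields set is {} in JSON.
		_, _ = w.Write([]byte("{}"))
	case "application/x-protobuf":
		w.Header().Set("Content-Type", "application/x-protobuf")
		w.WriteHeader(http.StatusOK)
		// A Protobuf message with no fields set encodes to zero bytes,
		// so the body stays empty.
	default:
		http.Error(w, "unsupported Content-Type", http.StatusUnsupportedMediaType)
	}
}

// exercise sends one in-memory request through the handler and returns the
// status, Content-Type, and body that a client would see.
func exercise(contentType string) (status int, respContentType, body string) {
	req := httptest.NewRequest(http.MethodPost, "/otlp/v1/metrics", strings.NewReader("{}"))
	req.Header.Set("Content-Type", contentType)
	rec := httptest.NewRecorder()
	writeExportSuccess(rec, req)
	return rec.Code, rec.Header().Get("Content-Type"), rec.Body.String()
}

func main() {
	for _, ct := range []string{"application/json", "application/x-protobuf"} {
		status, respCT, body := exercise(ct)
		fmt.Printf("%s -> %d %s %q\n", ct, status, respCT, body)
	}
}
```

Rejecting other media types with 415 is a design choice of this sketch, not something the bug report prescribes.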

Testing the fix

To test the fix, you can send a request to the Mimir endpoint with a JSON payload, as shown below:

curl -i -XPOST -H 'Content-Type: application/json' <mimir endpoint> -d '{
  "resourceMetrics": [
    {
      "scopeMetrics": [
        {
          "metrics": [
            {
              "name": "test_metric",
              "unit": "s",
              "description": "",
              "gauge": {
                "dataPoints": [
                  {
                    "asInt": 1,
                    "timeUnixNano": 1741602284008000000,
                    "attributes": [
                      {
                        "key": "bar_label",
                        "value": {
                          "stringValue": "abc"
                        }
                      }
                    ]
                  }
                ]
              }
            }
          ]
        }
      ]
    }
  ]
}'

This request should return a 200 success response with a Content-Type header set to application/json and a body containing an empty JSON object {}.
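The same check can be automated. The Go sketch below posts a JSON payload and verifies the status, Content-Type, and body described above. All names in it (verifyJSONExport, checkAgainst, fixedHandler, blankHandler) are hypothetical, and the two handlers are local stand-ins for a fixed and an unfixed server, not Mimir itself.

```go
package main

import (
	"fmt"
	"io"
	"mime"
	"net/http"
	"net/http/httptest"
	"strings"
)

// verifyJSONExport posts a JSON payload to endpoint and checks that the
// response is 200, has Content-Type application/json, and has the body {}.
func verifyJSONExport(endpoint, payload string) error {
	resp, err := http.Post(endpoint, "application/json", strings.NewReader(payload))
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return err
	}
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("status is %d, want 200", resp.StatusCode)
	}
	// Compare media types only, so a trailing charset parameter is accepted.
	mediaType, _, _ := mime.ParseMediaType(resp.Header.Get("Content-Type"))
	if mediaType != "application/json" {
		return fmt.Errorf("response Content-Type is %q, want %q", mediaType, "application/json")
	}
	if got := strings.TrimSpace(string(body)); got != "{}" {
		return fmt.Errorf("body is %q, want %q", got, "{}")
	}
	return nil
}

// fixedHandler stands in for a server with the fix applied.
func fixedHandler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	_, _ = w.Write([]byte("{}"))
}

// blankHandler stands in for the behaviour in the bug report: 200 with a
// blank body and no Content-Type header.
func blankHandler(w http.ResponseWriter, r *http.Request) {
	w.WriteHeader(http.StatusOK)
}

// checkAgainst runs verifyJSONExport against a local test server.
func checkAgainst(handler http.HandlerFunc) error {
	srv := httptest.NewServer(handler)
	defer srv.Close()
	return verifyJSONExport(srv.URL, `{"resourceMetrics": []}`)
}

func main() {
	fmt.Println("fixed:", checkAgainst(fixedHandler))
	fmt.Println("blank:", checkAgainst(blankHandler))
}
```

To run the check against a real deployment, call verifyJSONExport with the Mimir endpoint URL and the payload from the curl command above.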

Conclusion

Mimir's OTLP responses do not follow the specification's Content-Type requirements, which affects the correctness of the responses. The fix is to set the Content-Type header based on the response body format and to reply in the same format as the request. For a JSON request, a successful response should then carry Content-Type: application/json and a body containing an empty JSON object {}.