Duplicate Logs For FastAPI Python Applications After Bumping DD Agent From 7.59.0 To 7.61.0

by ADMIN

Introduction

This article discusses duplicate logs that appeared in Datadog's APM endpoint plots after we upgraded the Datadog agent from 7.59.0 to 7.61.0 for FastAPI Python micro-services deployed in a Kubernetes (k8s) cluster. It covers the possible causes we considered and the solution we settled on.

Background

Our organization runs a few FastAPI Python micro-services in a k8s cluster. All of them are instrumented with dd-trace-py by prefacing the uvicorn server command with the ddtrace-run CLI. We manage the Datadog agents in the cluster with the Datadog helm chart, which simplifies deploying and managing agents in k8s.

The Issue

When we bumped the helm chart from 3.87.0 to 3.88.0, which upgraded the agent from 7.59.0 to 7.61.0, the same logs started to appear twice in the APM endpoint plots, under different paths even though they are the same calls. The duplication is confusing and makes it harder to analyze and troubleshoot the performance of our micro-services.

Possible Causes

We have not confirmed a root cause. The candidates we considered are:

  • Agent configuration changes: The upgrade from 7.59.0 to 7.61.0 may have changed agent configuration or defaults in a way that produces the duplicate logs.
  • Service configuration changes: The upgrade may also have changed how the service configuration is interpreted, producing the duplicate logs.
  • Datadog agent version incompatibility: Agent 7.61.0 may be incompatible with the dd-trace-py version our services run, producing the duplicate logs.
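
To evaluate the third candidate, it helps to record the exact dd-trace-py version each service runs next to the agent version. The following is a minimal sketch using only the Python standard library; the helper name `installed_version` is our own, and it assumes dd-trace-py is installed under its PyPI distribution name `ddtrace`.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Assumption: dd-trace-py is installed as the "ddtrace" distribution.
    tracer = installed_version("ddtrace")
    print(f"dd-trace-py version: {tracer or 'not installed'}")
```

Running this inside each service's container gives one half of the agent/tracer version pair to compare across environments.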

Investigation

To investigate the issue further, we collected the following and looked for errors or warnings that might point to the cause:

  • Agent logs: Logs from the Datadog agent pods.
  • Service logs: Logs from the FastAPI services themselves.
  • Datadog APM logs: Logs related to APM trace collection.
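
Scanning the collected logs for warnings and errors can be scripted. The sketch below is one way to do it under stated assumptions: it only looks for whole-word level tokens such as WARN or ERROR and assumes nothing else about the line format, and the sample lines are invented placeholders, not real agent or service output.

```python
import re
from collections import Counter
from typing import Iterable, List, Tuple

# Whole-word level tokens to flag; adjust to whatever your logs actually emit.
LEVEL_PATTERN = re.compile(r"\b(WARN|WARNING|ERROR|CRITICAL)\b")


def flag_lines(lines: Iterable[str]) -> List[Tuple[int, str, str]]:
    """Return (line_number, level, text) for each line containing a flagged level."""
    flagged = []
    for number, line in enumerate(lines, start=1):
        match = LEVEL_PATTERN.search(line)
        if match:
            flagged.append((number, match.group(1), line.rstrip("\n")))
    return flagged


def summarize(flagged: List[Tuple[int, str, str]]) -> Counter:
    """Count flagged lines per level."""
    return Counter(level for _, level, _ in flagged)


# Invented placeholder lines, for illustration only.
sample = [
    "INFO starting up",
    "WARN placeholder warning message",
    "ERROR placeholder error message",
]

flagged = flag_lines(sample)
for number, level, text in flagged:
    print(f"line {number}: [{level}] {text}")
print(dict(summarize(flagged)))
```

In practice, `flag_lines` can be fed an open file object for each collected log, since it accepts any iterable of lines.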

Solution

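After investigating the issue, we found that the solution is to downgrade the Datadog agent from 7.61.0 to 7.59.0, the version that helm chart 3.87.0 deployed for us. This resolves the duplicate logs and restores the previous behavior of the Datadog agent. Because the root cause is still unknown, the downgrade is best treated as a workaround until a later agent version can be verified.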
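
To keep the agent from drifting back to the affected version, a small guard in a deployment pipeline can compare the running agent version against the two versions named in this article. This is only a sketch: the function names are hypothetical, how the running version string is obtained is left out, and only plain MAJOR.MINOR.PATCH strings are handled.

```python
from typing import Tuple

KNOWN_GOOD = "7.59.0"  # agent version without the duplicates, per this article
KNOWN_BAD = "7.61.0"   # agent version on which the duplicates were observed


def parse_version(text: str) -> Tuple[int, int, int]:
    """Parse a plain 'MAJOR.MINOR.PATCH' string into a tuple of ints.

    Raises ValueError for anything else (for example, suffixed versions).
    """
    major, minor, patch = text.strip().split(".")
    return int(major), int(minor), int(patch)


def check_agent_version(running: str) -> str:
    """Classify a running agent version as 'ok', 'affected', or 'untested'."""
    parsed = parse_version(running)
    if parsed == parse_version(KNOWN_BAD):
        return "affected"
    if parsed == parse_version(KNOWN_GOOD):
        return "ok"
    # Any other version was not evaluated in this article.
    return "untested"


if __name__ == "__main__":
    for candidate in ("7.59.0", "7.61.0"):
        print(candidate, "->", check_agent_version(candidate))
```

Returning "untested" for every other version is deliberate: the article only establishes behavior for 7.59.0 and 7.61.0, so the guard makes no claim about anything else.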

Conclusion

In conclusion, upgrading the Datadog agent from 7.59.0 to 7.61.0 caused duplicate logs to appear in the APM endpoint plots of our FastAPI Python micro-services on k8s. The root cause still requires a thorough investigation; the candidates are agent configuration changes, service configuration changes, and an incompatibility between agent 7.61.0 and dd-trace-py. For now, the solution is to downgrade the Datadog agent from 7.61.0 to 7.59.0.

Recommendations

Based on our experience with this issue, we recommend the following:

  • Monitor agent logs: Watch the agent logs for errors or warnings, particularly after an upgrade.
  • Monitor service logs: Watch the service logs for errors or warnings that coincide with agent changes.
  • Monitor Datadog APM logs: Watch the Datadog APM logs for errors or warnings that may explain unexpected changes in the endpoint plots.
  • Test agent upgrades: Test agent upgrades in a non-production environment to detect any issues before upgrading the agent in production.
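
One way to make a non-production upgrade test concrete is to compare the set of APM resource names seen before and after the upgrade, since the problem showed up as the same calls under additional paths. The sketch below assumes the resource names have already been exported by some means not shown here; the function name and the sample names are invented placeholders, not the actual resources from our services.

```python
from typing import Dict, Iterable, List


def diff_resources(before: Iterable[str], after: Iterable[str]) -> Dict[str, List[str]]:
    """Compare two collections of APM resource names and report what changed."""
    before_set, after_set = set(before), set(after)
    return {
        "added": sorted(after_set - before_set),
        "removed": sorted(before_set - after_set),
    }


# Invented placeholder names, for illustration only.
before_upgrade = ["GET /health", "GET /items/{item_id}"]
after_upgrade = ["GET /health", "GET /items/{item_id}", "GET /placeholder-new-path"]

report = diff_resources(before_upgrade, after_upgrade)
print(report)
```

A non-empty "added" list after an agent upgrade with no application change is a signal worth checking before the upgrade reaches production.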

Future Work

In the future, we plan to investigate the following:

  • Agent configuration changes: Investigate the agent configuration changes that may have caused the duplicate logs.
  • Service configuration changes: Investigate the service configuration changes that may have caused the duplicate logs.
  • Datadog agent version incompatibility: Investigate the Datadog agent version incompatibility that may have caused the duplicate logs.

Introduction

In our previous article, we discussed the issue of duplicate logs appearing in Datadog's APM endpoint plots after upgrading the Datadog agent from 7.59.0 to 7.61.0 in a FastAPI Python micro-service deployed in a Kubernetes (k8s) cluster. We also provided a solution to resolve the issue by downgrading the Datadog agent from 7.61.0 to 7.59.0. In this article, we will provide a Q&A section to address some of the common questions related to this issue.

Q&A

Q: What is the cause of duplicate logs in Datadog's APM endpoint plots?

A: The cause of duplicate logs in Datadog's APM endpoint plots is not yet fully understood. However, it is believed to be related to changes in the Datadog agent configuration or service configuration that occurred during the upgrade from 7.59.0 to 7.61.0.

Q: How can I prevent duplicate logs from appearing in Datadog's APM endpoint plots?

A: Because the root cause is not yet understood, there is no confirmed way to prevent the duplicates on 7.61.0 itself. The following practices help you catch the problem before it reaches production:

  • Monitor agent logs to detect any errors or warnings that may indicate the cause of the issue.
  • Monitor service logs to detect any errors or warnings that may indicate the cause of the issue.
  • Monitor Datadog APM logs to detect any errors or warnings that may indicate the cause of the issue.
  • Test agent upgrades in a non-production environment to detect any issues before upgrading the agent in production.

Q: What are the possible causes of duplicate logs in Datadog's APM endpoint plots?

A: The possible causes of duplicate logs in Datadog's APM endpoint plots are:

  • Agent configuration changes
  • Service configuration changes
  • Datadog agent version incompatibility

Q: How can I troubleshoot duplicate logs in Datadog's APM endpoint plots?

A: To troubleshoot duplicate logs in Datadog's APM endpoint plots, you can try the following:

  • Collect agent logs to see if there are any errors or warnings that may indicate the cause of the issue.
  • Collect service logs to see if there are any errors or warnings that may indicate the cause of the issue.
  • Collect Datadog APM logs to see if there are any errors or warnings that may indicate the cause of the issue.

Q: What is the solution to the issue of duplicate logs in Datadog's APM endpoint plots?

A: The solution to the issue of duplicate logs in Datadog's APM endpoint plots is to downgrade the Datadog agent from 7.61.0 to 7.59.0.

Q: Can I upgrade the Datadog agent to a newer version without experiencing duplicate logs?

A: It is possible to upgrade the Datadog agent to a newer version without experiencing duplicate logs. However, it is recommended to test the upgrade in a non-production environment before upgrading the agent in production.

