[Bug]: Unable to Tell Whether DeepDoc Is Used When Creating a Dataset via API

Mar 12, 2025 by ADMIN

Introduction

As users of the RAGFlow platform, we have encountered an issue when creating a dataset via API and then viewing its configuration in the user interface (UI). The chunking method and parser fields are left blank, so it is unclear which parser is in use. This report outlines the steps to reproduce the issue, provides additional information, and describes the expected behavior.

Self Checks

Before submitting this report, we have performed the following self-checks to ensure that we are using the correct template and following the guidelines:

We have searched for existing issues, including closed ones, on the RAGFlow GitHub repository.
We confirm that we are using English to submit this report, as per the Language Policy.
We have not modified the template, and we have filled in all the required fields.

RAGFlow Workspace Code Commit ID

Unfortunately, we do not have the RAGFlow workspace code commit ID available at this time.

RAGFlow Image Version

We are using RAGFlow image version v0.17.1.

Other Environment Information

We are running RAGFlow connected with Ollama via Docker containers, and all the components are deployed on an AWS virtual machine.

Actual Behavior

When a dataset is created via API and its configuration is then opened in the UI, the chunking method and parser fields are empty. As a result, it is not clear whether DeepDoc is being used.

Expected Behavior

When creating a dataset via API and visualizing its configuration via the UI, we expect the parser to be clearly identified as DeepDoc, and the chunk configuration to be displayed as the one chosen in the parser configuration when creating the dataset.
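The expectation above can be expressed as a small check: the parser configuration sent at creation time should match what is read back later. The following is a minimal sketch, not part of RAGFlow. The helper name `config_mismatches` is hypothetical, and the key names in the usage note (`chunk_token_num`, `delimiter`) are only illustrative assumptions about what a parser configuration may contain.

```python
def config_mismatches(requested, stored):
    """Return {key: (requested_value, stored_value)} for every requested key
    whose stored value differs.

    Both arguments are plain dictionaries. A key that is absent from `stored`
    is reported with a stored value of None.
    """
    stored = stored or {}  # treat a missing configuration as empty
    return {
        key: (value, stored.get(key))
        for key, value in requested.items()
        if stored.get(key) != value
    }
```

For example, calling `config_mismatches({"chunk_token_num": 128, "delimiter": "\n"}, {"chunk_token_num": 128})` reports only the `delimiter` key, because that is the only requested setting that was not read back with the same value. An empty result means every requested setting was preserved.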

Steps to Reproduce

To reproduce this issue, follow these steps:

Create a knowledge base (KB) using the RAGFlow API.
Go to the UI and click on the configuration of the created KB.
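The first step can be sketched as follows. This is a minimal example under stated assumptions, not the exact call used in the original report: we assume the dataset endpoint is `POST /api/v1/datasets` with a bearer API key, and that the request body accepts `name`, `chunk_method`, and `parser_config` fields. The base URL, API key, and dataset name are placeholders. Please verify the endpoint and field names against the API reference for your RAGFlow version.

```python
import json
import urllib.request


def build_create_dataset_request(base_url, api_key, name,
                                 chunk_method="naive", parser_config=None):
    """Build (but do not send) a request that creates a dataset.

    Assumption: the endpoint path and the field names below follow the
    RAGFlow HTTP API; check them against your installed version.
    """
    payload = {"name": name, "chunk_method": chunk_method}
    if parser_config is not None:
        payload["parser_config"] = parser_config
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/datasets",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + api_key,
        },
        method="POST",
    )


def send(request):
    """Send the request and return the decoded JSON response body."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))
```

Building the request separately from sending it makes it easy to log or inspect exactly which chunking method and parser configuration were submitted before comparing them with what the UI shows.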

Additional Information

Unfortunately, we do not have any additional information to provide at this time.

Conclusion

In summary, when a dataset is created via API and its configuration is opened in the UI, the chunking method and parser fields are empty, so it is unclear which parser is in use. We expect the parser to be clearly identified as DeepDoc, and the chunk configuration shown to match the parser configuration chosen when the dataset was created.

Recommendations

To resolve this issue, we recommend the following:

Investigate why the chunking method and parser fields are empty for datasets created via API.
Update the RAGFlow API so that the parser in use is clearly reported when a dataset is created.
Update the UI to display the chunk configuration chosen in the parser configuration when the dataset was created.

Related Issues

We have searched for existing issues related to this problem and have not found any similar reports.

Attachments

Unfortunately, we do not have any attachments to provide at this time.

License

This report is licensed under the MIT License.

Acknowledgments

We would like to acknowledge the RAGFlow team for their efforts in developing and maintaining the platform.

References

We have not included any references in this report.

Appendix

Introduction

The report above describes an issue with the RAGFlow platform when creating a dataset via API and then viewing its configuration in the UI: the chunking method and parser fields are left blank, so it is unclear which parser is in use. This Q&A section answers frequently asked questions related to the issue.

Q: What is the cause of the issue?

A: The cause of the issue is not yet clear. However, it is believed to be related to the way the RAGFlow API handles the creation of datasets and the display of parser configurations in the UI.

Q: How can I reproduce the issue?

A: To reproduce the issue, follow these steps:

Create a knowledge base (KB) using the RAGFlow API.
Go to the UI and click on the configuration of the created KB.

Q: What is the expected behavior?

A: When creating a dataset via API and visualizing its configuration via the UI, we expect the parser to be clearly identified as DeepDoc, and the chunk configuration to be displayed as the one chosen in the parser configuration when creating the dataset.

Q: How can I determine if DeepDoc is being used?

A: Unfortunately, the UI does not currently show this. For a dataset created via API, the configuration page leaves the chunking method and parser fields blank, so it is not possible to tell from the UI whether DeepDoc is being used.
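One thing that may be worth trying is to inspect the dataset record returned by the API rather than the UI. The sketch below is not a confirmed workaround. It assumes the record is a dictionary with a `parser_config` entry, and that this entry carries a `layout_recognize` field holding either a boolean or a string such as "DeepDOC". Both the field name and its possible values are assumptions that should be checked against your RAGFlow version. The helper name `uses_deepdoc` is hypothetical.

```python
def uses_deepdoc(dataset):
    """Guess whether a dataset record indicates DeepDoc layout recognition.

    Returns True or False when the assumed `layout_recognize` field is
    present, and None when the record gives no indication either way.
    """
    parser_config = dataset.get("parser_config") or {}
    value = parser_config.get("layout_recognize")
    if value is None:
        return None  # the record does not say which recognizer is used
    if isinstance(value, bool):
        return value  # assumed boolean form of the flag
    return str(value).strip().lower() == "deepdoc"  # assumed string form
```

A return value of `None` corresponds to the situation described in this report: the record itself gives no indication of the parser in use.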

Q: What are the implications of this issue?

A: Users cannot reliably tell from the UI which parser is used for a dataset created via API. This may lead to confusion and errors when using the RAGFlow platform.

Q: How can I get help with this issue?

A: If you are experiencing issues with the RAGFlow platform, including the inability to determine the use of DeepDoc when creating a dataset via API, please contact the RAGFlow support team for assistance.

Q: What is the current status of the issue?

A: The current status of the issue is that it is being investigated by the RAGFlow development team. We will provide updates on the status of the issue as more information becomes available.

Q: When can I expect a resolution to the issue?

A: We cannot provide a specific timeline for a resolution. However, we will work to resolve the issue as quickly as possible and will post updates as they become available.

Q: How can I stay up-to-date with the latest information on the issue?

A: To stay up-to-date with the latest information on the issue, please follow the RAGFlow blog and social media channels for updates.

Q: What are the next steps for resolving the issue?

A: The next steps for resolving the issue will be determined by the RAGFlow development team. We will provide updates on the status of the issue as more information becomes available.

Q: Can I contribute to the resolution of the issue?

A: Yes, if you have any information or insights that may be helpful in resolving the issue, please contact the RAGFlow support team. We appreciate any contributions that can help us resolve the issue as quickly as possible.