Offset Pagination Problem with HTTP Storage: Parameters Being Specified More Than Once

Introduction

This article discusses an issue encountered when using the pagination feature of the HTTP storage plugin in Apache Drill to query an OData source. The $skip and $top parameters are appended to the URL on each request, so requests after the first carry multiple instances of the same parameters. This leads to errors such as the one described below.

Understanding the Issue

To reproduce the issue, create a storage plugin named customers with the configuration below. Its paginator section sets limitParam to $top, offsetParam to $skip, pageSize to 15, and method to OFFSET.

{
  "type": "http",
  "connections": {
    "customers": {
      "url": "https://services.odata.org/V4/Northwind/Northwind.svc/Customers",
      "requireTail": false,
      "method": "GET",
      "dataPath": "value",
      "authType": "none",
      "inputType": "json",
      "xmlDataLevel": 1,
      "postParameterLocation": "QUERY_STRING",
      "verifySSLCert": true,
      "paginator": {
        "limitParam": "$top",
        "offsetParam": "$skip",
        "pageSize": 15,
        "method": "OFFSET"
      }
    }
  },
  "retryDelay": 1000,
  "proxyType": "direct",
  "authMode": "SHARED_USER",
  "enabled": true
}

Running the query select * from customers.customers; against this plugin produces the following error:

org.apache.drill.common.exceptions.UserRemoteException: DATA_READ ERROR: Error parsing JSON - Unexpected character ('<' (code 60)): expected a valid value (JSON String, Number (or 'NaN'/'+INF'/'-INF'), Array, Object or token 'null', 'true' or 'false')
 at [Source: REDACTED (`StreamReadFeature.INCLUDE_SOURCE_IN_LOCATION` disabled); line: 1, column: 2]

Syntax error
Connection: customers
Plugin: customers
URL: https://services.odata.org/V4/Northwind/Northwind.svc/Customers?%24skip=0&%24top=15&%24skip=15&%24top=15
Fragment: 0:0
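
The duplication is easier to see once the URL from the error message is decoded (%24 is the percent-encoded $). The following is a minimal sketch using only the Python standard library; the helper name repeated_params is chosen here for illustration and is not part of Drill.

```python
from collections import Counter
from urllib.parse import parse_qsl, urlsplit

# URL copied from the error message above.
FAILING_URL = (
    "https://services.odata.org/V4/Northwind/Northwind.svc/Customers"
    "?%24skip=0&%24top=15&%24skip=15&%24top=15"
)


def repeated_params(url: str) -> dict[str, list[str]]:
    """Return each query parameter that occurs more than once, with all its values."""
    pairs = parse_qsl(urlsplit(url).query)  # percent-decodes %24 to $
    counts = Counter(name for name, _ in pairs)
    return {
        name: [value for key, value in pairs if key == name]
        for name, count in counts.items()
        if count > 1
    }


print(repeated_params(FAILING_URL))
# {'$skip': ['0', '15'], '$top': ['15', '15']}
```

Both parameters appear twice: once with the values for the first page and once with the values for the second.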

Analyzing the Error

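The error message indicates that the JSON parser met an unexpected character (<) at the very start of the response (line 1). This suggests that the service did not return JSON at all, most likely an XML or HTML error document instead. The URL in the error message shows the probable reason: it contains $skip and $top twice (%24 is the percent-encoded $), once with the first page's values ($skip=0, $top=15) and once with the second page's ($skip=15, $top=15).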
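
To illustrate why a leading < produces a JSON parsing error, the sketch below feeds a non-JSON body to a JSON parser. The XML body is hypothetical, since the article does not show what the service actually returned, and Python's json module stands in for the parser Drill uses.

```python
import json
from typing import Optional

# Hypothetical response body: the article does not show what the service returned.
XML_BODY = '<?xml version="1.0" encoding="utf-8"?><error>Bad Request</error>'
JSON_BODY = '{"value": []}'


def json_parse_error(text: str) -> Optional[str]:
    """Return the parser's message if text is not valid JSON, otherwise None."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        return f"{exc.msg} (position {exc.pos})"
    return None


print(json_parse_error(XML_BODY))   # a parse error reported at the first character
print(json_parse_error(JSON_BODY))  # None
```

Any parser that expects JSON will stop at the first character of such a body, which matches the location reported in the Drill error.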

Solution

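To resolve this issue, the requests must stop carrying multiple instances of the $skip and $top parameters. One suggested change is to set the paginator method to LIMIT instead of OFFSET, as in the configuration below. Treat this as unverified: Drill's HTTP plugin documentation describes offset, page, and index pagination, so confirm that your Drill version accepts LIMIT before relying on it.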

{
  "type": "http",
  "connections": {
    "customers": {
      "url": "https://services.odata.org/V4/Northwind/Northwind.svc/Customers",
      "requireTail": false,
      "method": "GET",
      "dataPath": "value",
      "authType": "none",
      "inputType": "json",
      "xmlDataLevel": 1,
      "postParameterLocation": "QUERY_STRING",
      "verifySSLCert": true,
      "paginator": {
        "limitParam": "$top",
        "offsetParam": "$skip",
        "pageSize": 15,
        "method": "LIMIT"
      }
    }
  },
  "retryDelay": 1000,
  "proxyType": "direct",
  "authMode": "SHARED_USER",
  "enabled": true
}

If your Drill version accepts this setting, each request should carry a single $skip and a single $top parameter, and the query should execute successfully.

Frequently Asked Questions

Q: What is the offset pagination problem with HTTP storage?

A: With offset pagination enabled, the $skip and $top parameters are appended to the URL on each request, so requests after the first contain multiple instances of these parameters. The query then fails instead of returning the next page of data.

Q: What is the cause of the offset pagination problem?

A: The problem lies in how the HTTP storage plugin builds the URL for each page. With the paginator method set to OFFSET, the plugin adds $skip and $top to every request. In the failing URL, the second request still carries the first page's parameters ($skip=0, $top=15) in addition to its own ($skip=15, $top=15), which indicates that the parameters accumulate from one request to the next instead of being replaced.
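
One way to arrive at exactly the URL seen in the error message is to append each page's parameters to the URL used for the previous page. The sketch below contrasts that with building every page URL from the unmodified base URL. It is an illustration under that assumption, not Drill's actual implementation, and the function names are hypothetical.

```python
from urllib.parse import urlencode

BASE_URL = "https://services.odata.org/V4/Northwind/Northwind.svc/Customers"
PAGE_SIZE = 15  # "pageSize" from the paginator section


def page_query(page_index: int) -> str:
    """Encode one $skip/$top pair for a zero-based page index."""
    return urlencode([("$skip", page_index * PAGE_SIZE), ("$top", PAGE_SIZE)])


def accumulating_urls(pages: int) -> list[str]:
    """Hypothetical faulty builder: appends each page's parameters to the previous URL."""
    urls, current = [], BASE_URL
    for page in range(pages):
        separator = "&" if "?" in current else "?"
        current = f"{current}{separator}{page_query(page)}"
        urls.append(current)
    return urls


def fresh_urls(pages: int) -> list[str]:
    """Builds every page URL from the unmodified base URL."""
    return [f"{BASE_URL}?{page_query(page)}" for page in range(pages)]


print(accumulating_urls(2)[1])  # ...Customers?%24skip=0&%24top=15&%24skip=15&%24top=15
print(fresh_urls(2)[1])         # ...Customers?%24skip=15&%24top=15
```

The second URL from the accumulating builder is identical to the one reported in the error, while the fresh builder yields a single $skip and $top per request.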

Q: How can I reproduce the offset pagination problem?

A: Create an HTTP storage plugin named customers using the configuration shown under Understanding the Issue (limitParam $top, offsetParam $skip, pageSize 15, method OFFSET), then run select * from customers.customers;.

Q: What is the error message I see when I run the query?

A: The query fails with DATA_READ ERROR: Error parsing JSON - Unexpected character ('<' (code 60)). The full message, shown under Understanding the Issue, includes the request URL in which $skip and $top are each specified twice.

Q: How can I resolve the offset pagination problem?

A: The suggested change is to set the paginator method to LIMIT instead of OFFSET, using the configuration shown under Solution. Confirm that your Drill version accepts this value before relying on it.

Q: What are the benefits of using the LIMIT method instead of OFFSET?

A: The intended benefits of the change are:

  • Each request carries a single $skip and a single $top parameter
  • The service is no longer sent requests that it answers with a non-JSON error response
  • The risk of errors caused by incorrect pagination parameters is reduced

Q: Can I use the LIMIT method with other storage plugins?

A: The paginator settings discussed here belong to the HTTP storage plugin. Other storage plugins have their own configuration options, so consult the documentation for the specific plugin you are using.

Q: How can I troubleshoot the offset pagination problem?

A: To troubleshoot the offset pagination problem, you can:

  • Check the paginator section of the storage plugin configuration (limitParam, offsetParam, pageSize, and method)
  • Inspect the URL reported in the error message and verify that $skip and $top (encoded as %24skip and %24top) each appear only once
  • Try the LIMIT method instead of OFFSET, if your Drill version accepts it, to prevent multiple instances of the $skip and $top parameters from being specified

By following these steps, you can troubleshoot and resolve the offset pagination problem with HTTP storage.