Match Filenames For A PDB Built On Another Machine


Introduction

When working with PDB files built on another machine, matching filenames can be a challenging task. In this article, we will explore the current implementation of filename matching and propose an enhancement to make it more flexible and robust.

How it Works Now

The PDB's LINES section associates modules and source filenames with pairs of line numbers and addresses. We use this information to connect position-based // FUNCTION annotations with an address. However, this approach has some limitations.
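As a rough illustration of that association, the sketch below stores line-table entries as (filename, line, address) records and looks up the first address recorded after the line of an annotation. The sample paths, the addresses, the helper names, and the "first entry after the annotation" rule are all assumptions made for this example, not a description of reccmp's internals.

```python
from bisect import bisect_right
from typing import NamedTuple


class LineEntry(NamedTuple):
    filename: str  # source filename exactly as recorded in the PDB
    line: int      # line number in that file
    address: int   # address associated with that line


# Hypothetical sample data standing in for the LINES section.
LINES = [
    LineEntry(r"C:\src\game\main.cpp", 10, 0x401000),
    LineEntry(r"C:\src\game\main.cpp", 12, 0x401008),
    LineEntry(r"C:\src\game\main.cpp", 30, 0x401040),
    LineEntry(r"C:\src\game\util.cpp", 5, 0x402000),
]


def build_index(entries):
    """Group (line, address) pairs by filename and sort each group by line."""
    index = {}
    for entry in entries:
        index.setdefault(entry.filename, []).append((entry.line, entry.address))
    for pairs in index.values():
        pairs.sort()
    return index


def address_after_annotation(index, filename, annotation_line):
    """Return the address of the first entry on a line after the annotation."""
    pairs = index.get(filename, [])
    # (annotation_line, inf) sorts after every entry on the annotation line itself.
    pos = bisect_right(pairs, (annotation_line, float("inf")))
    return pairs[pos][1] if pos < len(pairs) else None


index = build_index(LINES)
```

The lookup only works if the filename used as the key is the one recorded in the PDB, which is why matching filenames matters for the rest of this article.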

Assumptions on Windows

If you are running reccmp on Windows, we assume the PDB came from a build on your machine and that all of the (absolute) paths it records actually exist. That makes them easy to match up: the paths in the PDB are already native Windows paths, so no translation step such as winepath is involved.

Assumptions on Non-Windows Platforms

If you are not on Windows, we still assume the PDB came from a build on your machine, but one that ran under Wine. We depend on winepath to translate the Windows paths to their POSIX equivalents. To avoid calling winepath too often, we start with one POSIX-to-Windows conversion of the input source code path, which is either the path provided by the user at the command line or the one from the reccmp-project.yml file. We then check each Windows path from the PDB to see if it is relative to the translated source path. If it is, we cut out the relative parts, add them to the original POSIX source path, and test whether the file exists.
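The flow described above can be sketched as follows. This is a minimal illustration, not reccmp's code: the function name and the demo paths are hypothetical, and the translated root is passed in as a plain string (on a real system it would come from something like a `winepath -w` call on the source path) so that the example runs without Wine.

```python
import tempfile
from pathlib import Path, PureWindowsPath


def match_pdb_paths(source_root, source_root_win, pdb_paths):
    """Map Windows paths from the PDB to existing files under source_root.

    source_root:     local POSIX source directory
    source_root_win: the same directory after POSIX-to-Windows translation
    """
    win_root = PureWindowsPath(source_root_win)
    matches = {}
    for raw in pdb_paths:
        try:
            # Keep only the parts below the translated source root.
            relative = PureWindowsPath(raw).relative_to(win_root)
        except ValueError:
            continue  # not under the source root (e.g. a compiler header)
        candidate = Path(source_root).joinpath(*relative.parts)
        if candidate.is_file():
            matches[raw] = candidate
    return matches


# Demo with a temporary source tree and made-up Windows paths.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "game").mkdir()
    (Path(tmp) / "game" / "main.cpp").write_text("int main() { return 0; }\n")
    matched = match_pdb_paths(
        tmp,
        r"Z:\work\proj",
        [
            r"Z:\work\proj\game\main.cpp",     # under the root and present locally
            r"Z:\work\proj\game\missing.cpp",  # under the root but no local file
            r"C:\msvc\include\string.h",       # outside the root
        ],
    )
    matched_keys = sorted(matched)
```

Only the first path ends up in the result. The other two are skipped, one because the local file does not exist and one because it does not sit under the translated root.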

The Enhancement

Both cases share the same limitation: they assume that the PDB came from a build on the same machine and that the paths actually exist. This makes it difficult to download the recomp binary and PDB from another location (e.g. the CI artifacts) and run the analysis on your own machine.

To address this issue, we propose an enhancement that applies the same principle of using a path fragment mapping, but without using winepath and without assuming that the path must exist (on Windows).

Path Fragment Mapping

The key idea behind this enhancement is to use a path fragment mapping to match the filenames. This involves the following steps:

  1. Get the path fragment: Get the path fragment from the input source code path. This is either the path provided by the user at the command line or the one from the reccmp-project.yml file.
  2. Check relative paths: Check each Windows path from the PDB to see if it is relative to the translated source path. If it is, cut out the relative parts and add them to the original POSIX source path.
  3. Test file existence: Test to see if the file exists. If it does, use the path fragment mapping to match the filenames.

Benefits of the Enhancement

The proposed enhancement has several benefits:

  • Flexibility: It allows you to download the recomp binary and PDB from another location and run the analysis on your own machine.
  • Robustness: It does not assume that the PDB came from a build on the same machine or that the paths actually exist.
  • Efficiency: It does not call winepath at all, so path matching no longer waits on an external process.

Implementation

To implement this enhancement, the reccmp code must be modified to apply the three path fragment mapping steps listed above: get the path fragment, check relative paths, and test file existence.

Code Changes

The code changes required to implement this enhancement are as follows:

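import os
from pathlib import PureWindowsPath

def get_path_fragment(path):
    # Normalize the input source code path (from the command line or
    # reccmp-project.yml) so it can serve as the local root of the mapping
    return os.path.normpath(path)

def check_relative_paths(windows_fragment, windows_paths):
    # Yield (windows_path, relative parts) for each Windows path from the PDB
    # that lies under windows_fragment. PureWindowsPath parses backslashes and
    # drive letters on any platform, so neither winepath nor Windows is needed.
    # Assumption: windows_fragment, the Windows-side prefix that corresponds to
    # the source path, is already known to the caller.
    root = PureWindowsPath(windows_fragment)
    for windows_path in windows_paths:
        try:
            yield windows_path, PureWindowsPath(windows_path).relative_to(root).parts
        except ValueError:
            continue  # this path is not under the fragment

def test_file_existence(path):
    # Only the local file has to exist, not the Windows path from the PDB
    return os.path.isfile(path)

def match_filenames(source_path, windows_fragment, windows_paths):
    # Cut out the relative parts, add them to the local source path, and keep
    # the pairs whose local file exists
    local_root = get_path_fragment(source_path)
    matches = {}
    for windows_path, parts in check_relative_paths(windows_fragment, windows_paths):
        candidate = os.path.join(local_root, *parts)
        if test_file_existence(candidate):
            matches[windows_path] = candidate
    return matches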
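
When the PDB was built on another machine, the Windows-side prefix of the source tree is not known in advance. One possible way to derive it, offered here as an assumption rather than as reccmp's actual method, is to compare trailing path components of the PDB paths with the local source files and let the matches vote on a prefix. The function name and the sample paths below are hypothetical.

```python
from collections import Counter
from pathlib import PurePosixPath, PureWindowsPath


def guess_windows_fragment(local_root, local_files, windows_paths):
    """Guess which Windows prefix in the PDB corresponds to local_root.

    local_files must be POSIX paths located under local_root.
    Returns the most frequently implied prefix, or None if nothing matches.
    """
    root = PurePosixPath(local_root)
    # Relative component tuples of the local files, lowercased because
    # Windows paths are case-insensitive.
    relatives = {
        tuple(part.lower() for part in PurePosixPath(f).relative_to(root).parts)
        for f in local_files
    }
    votes = Counter()
    for raw in windows_paths:
        parts = PureWindowsPath(raw).parts
        lowered = tuple(part.lower() for part in parts)
        # Try the longest suffix first; index 0 is the drive anchor of an
        # absolute path and can never be part of a relative match.
        for start in range(1, len(parts)):
            if lowered[start:] in relatives:
                votes[str(PureWindowsPath(*parts[:start]))] += 1
                break
    return votes.most_common(1)[0][0] if votes else None


fragment = guess_windows_fragment(
    "/home/me/proj",
    ["/home/me/proj/game/main.cpp", "/home/me/proj/util/str.cpp"],
    [
        r"D:\a\build\game\main.cpp",
        r"D:\a\build\UTIL\str.cpp",
        r"C:\msvc\include\string.h",
    ],
)
```

With this sample data the two project files both imply the same prefix and the compiler header implies none. Voting is a deliberately simple tie-breaker: a file name that also exists outside the source tree could produce a stray vote, so a real implementation would likely want a stricter rule.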

Frequently Asked Questions

The sections above explored the current implementation of filename matching and proposed an enhancement to make it more flexible and robust. The rest of this article answers some frequently asked questions about the proposed enhancement.

Q: What is the main problem with the current implementation?

A: The current implementation assumes that the PDB came from a build on the same machine and that the paths actually exist. This makes it difficult to download the recomp binary and PDB from another location (e.g. the CI artifacts) and run the analysis on your own machine.

Q: How does the proposed enhancement address this problem?

A: The proposed enhancement uses a path fragment mapping to match the filenames. This involves getting the path fragment from the input source code path, checking each Windows path from the PDB to see if it is relative to the translated source path, and testing to see if the file exists.

Q: What are the benefits of the proposed enhancement?

A: The proposed enhancement has several benefits, including:

  • Flexibility: It allows you to download the recomp binary and PDB from another location and run the analysis on your own machine.
  • Robustness: It does not assume that the PDB came from a build on the same machine or that the paths actually exist.
  • Efficiency: It does not call winepath at all, so path matching no longer waits on an external process.

Q: How does the proposed enhancement work?

A: The proposed enhancement works as follows:

  1. Get the path fragment: Get the path fragment from the input source code path.
  2. Check relative paths: Check each Windows path from the PDB to see if it is relative to the translated source path. If it is, cut out the relative parts and add them to the original POSIX source path.
  3. Test file existence: Test to see if the file exists. If it does, use the path fragment mapping to match the filenames.

Q: What code changes are required to implement the proposed enhancement?

A: The required code changes are the ones listed in the Code Changes section above. They consist of four small functions: get_path_fragment, check_relative_paths, test_file_existence, and match_filenames.

Q: What are the implications of the proposed enhancement?

A: The proposed enhancement has several implications, including:

  • Improved flexibility: It allows you to download the recomp binary and PDB from another location and run the analysis on your own machine.
  • Improved robustness: It does not assume that the PDB came from a build on the same machine or that the paths actually exist.
  • Improved efficiency: It does not call winepath at all, so path matching no longer waits on an external process.

Conclusion

This article answered some frequently asked questions about the proposed enhancement to filename matching. The enhancement uses a path fragment mapping to match the filenames, so that you can download the recomp binary and PDB from another location, such as the CI artifacts, and run the analysis on your own machine. The code changes needed to implement it are listed under Code Changes.