[WIP][RFC] Add sparse host buffer source #16252

Draft: wants to merge 2 commits into base branch-24.08

rjzamora
Member

Related to #15919

DISCLAIMER: This is meant to be a rough RFC to illustrate my idea for a temporary mitigation strategy for NativeFile removal. I am probably not the right person to take the libcudf changes over the line but I still threw some C++ code together for fun :)

Idea: We don't need NativeFile support to get reasonable partial-IO performance from remote storage, as long as we transfer only the necessary byte ranges into host memory with fsspec and libcudf can read from the resulting sparse <offset, byte-range> mapping. The fsspec component is covered in #16166, but that PR currently wastes a lot of host memory by copying the necessary byte ranges into a larger "proxy" byte range that matches the size of the actual file.
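
To make the idea concrete, here is a minimal Python sketch of a sparse <offset, byte-range> mapping. It is not the code in this PR or any libcudf API: the class name `SparseHostBuffer`, its methods, and the sample offsets are hypothetical and only illustrate how a reader could serve requests from resident ranges without allocating a full-size proxy buffer.

```python
from bisect import bisect_right


class SparseHostBuffer:
    """Hypothetical sparse view of a file: only selected byte ranges are resident."""

    def __init__(self, ranges, file_size):
        # `ranges` maps an absolute file offset to the bytes stored at that offset.
        self._offsets = sorted(ranges)
        self._chunks = [ranges[o] for o in self._offsets]
        self.file_size = file_size  # size of the real file, not of resident data

    def read(self, offset, size):
        # Locate the last resident chunk that starts at or before `offset`.
        i = bisect_right(self._offsets, offset) - 1
        if i < 0:
            raise ValueError(f"offset {offset} is not resident")
        start, chunk = self._offsets[i], self._chunks[i]
        # The request must fall entirely inside that single chunk.
        if offset + size > start + len(chunk):
            raise ValueError(f"range [{offset}, {offset + size}) is not resident")
        rel = offset - start
        return chunk[rel:rel + size]

    def resident_bytes(self):
        # Host memory actually held, as opposed to `file_size`.
        return sum(len(c) for c in self._chunks)


# Usage: two resident ranges out of a 1012-byte file (assumed sample data).
buf = SparseHostBuffer({0: b"PAR1", 1000: b"footer-bytes"}, file_size=1012)
print(buf.read(1000, 6))     # b'footer'
print(buf.resident_bytes())  # 16
```

In this sketch only 16 bytes are held in host memory for a 1012-byte file, whereas a full-size proxy buffer would hold all 1012. A request that touches a non-resident range raises an error instead of silently returning padding.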

@vuule @vyasr - Does a "sparse" host buffer source seem like a reasonable approach for the near term?

@rjzamora added the "2 - In Progress" and "improvement" labels Jul 11, 2024
@github-actions bot added the "libcudf" label Jul 11, 2024