We are looking for a highly motivated Senior C++ Storage Engineer to join Voltron Data’s team. On the team, you’ll have the opportunity to help support and grow the Voltron Data and Apache Arrow ecosystems. You will work closely with Voltron Data development teams to implement performant storage and I/O functions targeting a wide variety of networked, cloud, and local storage solutions.
Why work at Voltron Data?
- We are Going for Impact: We are a Series A, venture-backed startup assembling a global team to build a new foundation for data analytics with Apache Arrow. This foundation will usher in a wave of innovation in data processing that can take full advantage of the speed and efficiency offered by modern hardware.
- We are Committed to Bridging Open Source Communities: We are a collection of open source maintainers who have been driving open source ecosystems over the last 15 years, particularly in the C++, Python, and R programming ecosystems.
- We are Building a Diverse, Inclusive Company: We are creating a representative, equitable, and respectful workplace that prioritizes employee growth. Everyone at Voltron Data is invested in the company’s success; all voices are critical to shaping the organization’s future.
Timeline
Below is a rough timeline of where you can expect to be at different points after starting in this position.
Upon joining:
- Spending time learning about the Apache Arrow memory layout, compute primitives, and APIs.
- Familiarizing yourself with the different partners for compute kernels and the query execution engine on Apache Arrow.
- Learning and embracing the Apache development process.
Within a month:
- Implementing new high-performance storage and I/O primitives.
- Benchmarking existing I/O library functions to determine where there are bottlenecks.
- Discovering and implementing optimizations in data reads and writes.
- Participating in peer code review of all PRs related to file storage and filesystem interaction.
- Contributing to technical discussions and technical design documents.
Within 6 months:
- Developing a comprehensive set of low-level benchmarks for I/O functions targeting various local, networked, and cloud storage technologies to enable monitoring for performance regressions.
- Ensuring that all filesystem interactions are compatible and performant across platforms (Linux, macOS, and Windows).
- Identifying and building reusable software components to ensure a high-quality, maintainable codebase.
Within 12 months:
- Analyzing I/O throughput in a massively parallel and distributed query engine, identifying inefficiencies, and crafting solutions to address them.
- Ensuring that everything related to storage is built to the highest possible quality, balancing performance, usability, and maintainability across the Voltron Data and Apache Arrow ecosystems.