
Hybrid Cloud Data at Uber: How Engineers Solved Extreme-Scale Replication Challenges
via InfoQ, by Leela Kumili
Uber’s HiveSync team optimized Hadoop DistCp to handle multi-petabyte replication across hybrid cloud and on-premises data lakes. Enhancements include task parallelization, Hadoop uber jobs (small jobs that run inside the ApplicationMaster instead of separate containers) for small transfers, and improved observability. Together these enabled 5x replication capacity and a seamless on-premises-to-cloud migration.


