Because of the random access pattern in a large data set, caching becomes much less effective. Once you get off of a local disk, you can run into problems, he said. They built a storage system that meets the IOPS requirements for the entire cluster without independent caching. The system supports roughly 150,000 file reads per second per 100 terabytes of storage, he said, and they can scale the IOPS at the same time as storage. “We’re going to shard the datasets between the different ranks so that we don’t get hotspots even if we use relatively small amounts of data,” Mansell said.
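The talk did not show how the sharding is implemented. As a rough illustration of the general idea of splitting a data set between ranks, here is a minimal sketch in Python. The function name `shard_indices` and its parameters (`seed`, `epoch`) are hypothetical and are not taken from the system described. It assumes every rank knows the total sample count and uses the same seed, so that all ranks compute the same shuffled order and then take disjoint, strided slices of it.

```python
import random


def shard_indices(num_samples, rank, world_size, seed=0, epoch=0):
    """Return the sample indices that one rank should read.

    Hypothetical helper for illustration only. Every rank builds the
    same shuffled permutation (same seed and epoch), then keeps every
    world_size-th entry starting at its own rank, so the shards are
    disjoint and together cover the whole data set.
    """
    if world_size <= 0:
        raise ValueError("world_size must be positive")
    if not 0 <= rank < world_size:
        raise ValueError("rank must be in [0, world_size)")

    indices = list(range(num_samples))
    # A private Random instance keeps the shuffle reproducible and
    # independent of any other use of the global random state.
    random.Random(seed + epoch).shuffle(indices)
    return indices[rank::world_size]


if __name__ == "__main__":
    # Example: 10 samples split across 4 ranks.
    for r in range(4):
        print(r, shard_indices(10, rank=r, world_size=4))
```

Because reads are spread across ranks in a shuffled order, no single rank repeatedly hits the same region of the data set. Changing `epoch` reshuffles the assignment between passes. Whether the system described in the talk works this way is not stated.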