Large-scale consolidation of distributed systems introduces data sharing between consumers that are not centrally managed but may be physically adjacent. For example, shared global data sets can be used jointly by different services of the same organization, possibly running on different virtual machines in the same data center. Similarly, neighboring CDNs provide fast access to the same Internet content. Cooperative caching, in which data are fetched from a neighboring cache instead of from the disk or the Internet, can significantly improve resource utilization and performance in such scenarios. However, existing cooperative caching approaches fail to address the selfish nature of cache owners and their conflicting objectives. This calls for a new storage model that explicitly considers the cost of cooperation and provides a framework for calculating the utility each owner derives from its own cache and from cooperating with others. We define such a model and construct four representative cooperation approaches to demonstrate how, and when, cooperative caching can be employed successfully in such large-scale systems. We present principal guidelines for cooperative caching derived from our experimental analysis. We show that choosing the best cooperation approach can decrease the system's I/O delay by as much as 87%, while imposing cooperation when it is unwarranted can increase it by as much as 92%.