Single-celled organisms can anticipate future environmental changes to gain an evolutionary advantage. Better predictions may yield higher fitness, but it is unclear how accurately cells can predict the future, and what limits their predictive capacity. In general, any prediction about the future must be based on the past. Therefore, how much information a cell collects from the past sets a fundamental bound on the amount of predictive information it can obtain. In this work, we first investigate this information bound for different classes of input signal, and identify the optimal system that reaches the bound. We subsequently show that, while biochemical networks exist that can reach the information bound, doing so is exceedingly costly in terms of physical resources such as proteins and energy. Moreover, under a resource constraint these networks can increase both the past and the predictive information by moving away from the bound. The reason is that not all bits of past information are equally predictive, nor equally costly. Consequently, a trade-off arises between the most predictive and the cheapest bits of past information. Computing the past and predictive information directly from experimental data, we find that the Escherichia coli chemotaxis network operates far from the information bound, but that it is optimally tuned to predict concentration changes in shallow gradients under a resource constraint. Finally, we investigate the accuracy of the classical Gaussian approximation to the mutual information rate and determine when it can be reliably used, and when other approximations or exact computations are required.
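
As a minimal sketch of the bound invoked above (the notation below is ours, not defined in the abstract), let $x$ denote the internal state of the cell, $s_{\mathrm{past}}$ the signal history it has observed, and $s_{\mathrm{future}}$ the signal it aims to predict. Since the cell state can depend on the future signal only through the past signal, the variables form a Markov chain $s_{\mathrm{future}} \to s_{\mathrm{past}} \to x$, and the data processing inequality gives
\[
I_{\mathrm{pred}} \;\equiv\; I(x;\, s_{\mathrm{future}}) \;\le\; I(x;\, s_{\mathrm{past}}) \;\equiv\; I_{\mathrm{past}} .
\]
For context on the final sentence: for two jointly Gaussian variables with correlation coefficient $\rho$, the mutual information is exactly $I = -\tfrac{1}{2}\ln\!\left(1-\rho^{2}\right)$; the classical Gaussian approximation applies such second-moment expressions to signals and networks that need not be Gaussian, which is why its accuracy must be checked. The trajectory-level mutual information rate studied in the paper generalizes this scalar formula and may take a different explicit form.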