One of the most flexible and modular approaches to reconfigurable systems is a bus-based approach. In order to get realistic performance estimates of these systems, detailed modeling of the processor as well as the bus and memory hierarchy is required. In addition, when coupling one or more reconfigurable units with a superscalar, out-of-order issue, load/store RISC CPU using the on-chip system bus, there are issues relating to cache coherency that need to be addressed. We have developed a cycle accurate co-simulator that uses a 'C' model of the processor and HDL models of the bus and reconfigurable units. We have also made modifications to the CPU pipeline to allow for non-cacheable accesses to the reconfigurable unit. This is reported in the paper. We have used this simulator to look at (a) The speedup obtained for two examples, namely, matrix multiplication and Lempel-Ziv compression, (b) the speedup obtained when there is a context switch from one application to the other and full reconfiguration is employed and (c) speedup obtained with partial reconfiguration. These results are reported in the paper. © 2004 Elsevier B.V. All rights reserved.