The production of recombinant proteins has become indispensable for both research and industrial applications. However, expressing a recombinant protein places stress on the host strain, which reduces its growth rate and hence the productivity of the protein. To improve yield, it is essential to understand the changes in the physiology and metabolism of the host and to reverse them by over- or under-expressing key genes. In this paper, we propose an approach based on Principal Component Analysis (PCA) to identify the genes that are differentially expressed in the host strain compared with the wild-type strain. These genes provide information about the changes in metabolic events caused by recombinant protein production. Our approach also identifies the regulators responsible for these changes; by over-expressing or knocking out these regulators, the behavior of the host can be returned to normal. We illustrate the proposed approach using a case study of recombinant protein production in E. coli. © 2007 Elsevier B.V. All rights reserved.
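The abstract does not spell out how PCA is applied, so the following is only a minimal sketch of one common way to use PCA for ranking differentially expressed genes, not the authors' procedure. It assumes a matrix of log2 expression ratios (host strain over wild type) with one row per gene and one column per sample, and it scores each gene by the size of its projection onto the leading principal component. The function names (`pca_gene_scores`, `top_genes`, `make_synthetic`), the choice of a single component, and the synthetic data are all hypothetical.

```python
import numpy as np


def pca_gene_scores(log_ratios, n_components=1):
    """Score genes by the magnitude of their projection onto the top PCs.

    log_ratios: (n_genes, n_samples) array of log2(host / wild-type) values.
    This layout is an assumption; the source does not describe its input.
    """
    X = np.asarray(log_ratios, dtype=float)
    # Center each sample (column) so the PCs describe variation across genes.
    Xc = X - X.mean(axis=0, keepdims=True)
    # PCA via singular value decomposition of the centered matrix.
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    # Row i of U[:, :k] * s[:k] holds gene i's coordinates on the top k PCs.
    projections = U[:, :n_components] * s[:n_components]
    return np.linalg.norm(projections, axis=1)


def top_genes(gene_names, log_ratios, k=3):
    """Return the k gene names with the largest PCA scores."""
    scores = pca_gene_scores(log_ratios)
    order = np.argsort(scores)[::-1][:k]
    return [gene_names[i] for i in order]


def make_synthetic(n_genes=50, n_samples=4, seed=0):
    """Build synthetic data (illustration only, not experimental results)."""
    rng = np.random.default_rng(seed)
    data = rng.normal(0.0, 0.1, size=(n_genes, n_samples))
    data[0] += 3.0   # strongly up-regulated in the host
    data[1] -= 3.0   # strongly down-regulated in the host
    data[2] += 2.5   # up-regulated in the host
    names = [f"gene{i}" for i in range(n_genes)]
    return names, data


names, data = make_synthetic()
print(top_genes(names, data, k=3))
```

In this toy setting the three perturbed genes dominate the first principal component, so they rank above the unperturbed ones. With real data, the number of components to keep and the score cutoff would need to be chosen from the data, and mapping the selected genes to their regulators would require a separate regulatory network resource that this sketch does not cover.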