The least mean square (LMS) algorithm and its variants have been the most widely used algorithms in adaptive signal processing. However, the LMS algorithm suffers from high computational complexity, especially for large filter lengths. The Fourier transform-based block normalized LMS (FBNLMS) reduces the computation count by using the discrete Fourier transform (DFT) and exploiting fast algorithms for computing the DFT. Even though the savings achieved with the FBNLMS over the direct LMS implementation are significant, the computational requirements of the FBNLMS are still very high, rendering many real-time applications, such as audio and video estimation, infeasible. The Hartley transform-based BNLMS (HBNLMS) is found to have a much lower computational complexity than the FBNLMS, with a memory requirement of almost the same order. This paper is based on the cosine and sine symmetric implementation of the discrete Hartley transform (DHT), which is the key to reducing the computational complexity of the FBNLMS by 33% asymptotically (with respect to multiplications). The parallel implementation of the discrete cosine transform (DCT), in turn, can lead to more efficient implementations of the HBNLMS. © 2005 IEEE.
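As a rough illustration of why the DHT can stand in for the DFT in block adaptive filtering, the following Python sketch performs a circular convolution, the core operation of a block LMS update, entirely with real-valued Hartley spectra. It is not the cosine and sine symmetric implementation referred to above: for brevity it assumes NumPy and obtains the DHT from the FFT, and the function names (`dht`, `idht`, `reverse_index`, `circular_convolve_dht`) are hypothetical.

```python
import numpy as np

def dht(x):
    """Discrete Hartley transform: H[k] = sum_n x[n] * (cos(2*pi*n*k/N) + sin(2*pi*n*k/N)).

    Computed here from the FFT for brevity (an assumption of this sketch,
    not a fast Hartley algorithm): H = Re(F) - Im(F).
    """
    x = np.asarray(x, dtype=float)
    F = np.fft.fft(x)
    return F.real - F.imag

def idht(H):
    """Inverse DHT: the DHT is its own inverse up to a factor of 1/N."""
    return dht(H) / len(H)

def reverse_index(H):
    """Return the spectrum indexed at (-k) mod N."""
    return np.roll(np.asarray(H)[::-1], 1)

def circular_convolve_dht(x, y):
    """Circular convolution of two equal-length real sequences via the DHT."""
    X, Y = dht(x), dht(y)
    Xr, Yr = reverse_index(X), reverse_index(Y)
    # Hartley convolution theorem: even part of Y weights X[k],
    # odd part of Y weights X[-k].
    Z = 0.5 * (X * (Y + Yr) + Xr * (Y - Yr))
    return idht(Z)
```

Because the input, the spectrum, and the output are all real, no complex arithmetic appears outside the FFT shortcut used inside `dht`; a dedicated fast Hartley routine would remove that as well.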