The wide range of code rates and code block sizes supported by today's wireless communication standards, together with the requirement for throughputs on the order of Gbps, necessitates sophisticated and highly parallel channel decoder architectures. Code rates specified in the LTE standard, which uses Turbo-Codes, range up to r = 0.94 to maximize the information throughput by transmitting only a minimum amount of parity information, which degrades the error-correcting performance. This holds especially for highly parallel hardware architectures. Therefore, the error-correcting performance must be traded off against the degree of parallel processing. State-of-the-art Turbo-Code decoder hardware architectures are optimized on code block level to alleviate this trade-off. In this paper, we follow a cross-layer approach by combining system-level knowledge about the rate matching and the transport block structure in LTE with the bit-level technique of on-the-fly CRC calculation. Thereby, our proposed Turbo-Code decoder hardware architecture achieves coding gains of 0.4–1.8 dB over the state of the art across a wide range of code block sizes. For the fully LTE-compatible Turbo-Code decoder, we demonstrate a negligible hardware overhead together with high area and energy efficiency, and we provide post-place-and-route synthesis results.
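To make the on-the-fly CRC idea concrete, the following minimal C sketch shows a bit-serial update of the LTE code-block CRC (gCRC24B) that can be folded into the pass over the decoded bits instead of being run as a separate post-processing step. The polynomial and initial value follow the 3GPP specification; the function names and the software formulation are illustrative assumptions and do not represent the proposed hardware architecture.

```c
#include <stdint.h>
#include <stddef.h>

/* LTE code-block CRC polynomial gCRC24B(D) = D^24 + D^23 + D^6 + D^5 + D + 1,
 * written with the implicit D^24 term dropped (generator 0x800063, initial value 0). */
#define CRC24B_POLY 0x800063u

/* Update a running 24-bit CRC with one hard-decision bit (MSB-first, bit-serial).
 * Illustrative only: a decoder can accumulate the CRC while the decoded
 * systematic bits are produced, i.e., "on the fly". */
static uint32_t crc24b_update_bit(uint32_t crc, unsigned bit)
{
    crc ^= (uint32_t)(bit & 1u) << 23;  /* align the new bit with the CRC MSB   */
    crc <<= 1;                          /* shift the register by one position    */
    if (crc & (1u << 24))               /* if a bit leaves the 24-bit register,  */
        crc ^= CRC24B_POLY;             /* reduce by the generator polynomial    */
    return crc & 0xFFFFFFu;             /* keep only the 24 CRC bits             */
}

/* Accumulate the CRC over a block of decoded bits as they become available.
 * A zero remainder over payload plus appended CRC indicates a correctly
 * decoded code block. */
static uint32_t crc24b_bits(const uint8_t *bits, size_t n)
{
    uint32_t crc = 0;
    for (size_t i = 0; i < n; i++)
        crc = crc24b_update_bit(crc, bits[i]);
    return crc;
}
```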