In this work, we present the first high-performance yet compact RTL design of the HQC cryptographic algorithm, fully compliant with the latest specification revision of April 2023, which added implicit rejection. Our design improves on the state of the art for the specialized polynomial multiplier and Reed-Solomon/Reed-Muller decoder components, which account for the largest share of the HQC computation time. Furthermore, we compare the efficiency of the sparse polynomial sampler proposed by the HQC team with alternative approaches proposed by the research community. We benchmarked our design on the Xilinx Artix-7 FPGA family, selected by the US NIST as the reference benchmarking platform for post-quantum cipher implementations. We report latency improvements for the key generation, encapsulation, and decapsulation operations between 1.58× and 2.93×, and efficiency improvements (in terms of execution time per hardware resources) from 1.23× to 1.62×, with respect to the current state-of-the-art RTL implementations of HQC components. Compared with the specification-compliant HLS implementation provided by the HQC team, we achieve speedups between 7.41× and 10.67×, and improve the overall efficiency by a factor between 6.05× and 10.98×.