An Arithmetic Logic Unit (ALU) is at the heart of any processor. The ALU is a machine, much like a washing machine, but instead of clothes, soap, and water, the ALU takes two operands, an operation code (op-code), and a clock. Just like a washing machine, the ALU takes a certain amount of time to operate, and the faster it works the more laundry (or calculations) gets completed in a set amount of time. Here we will design a small ALU.
An ALU has a set of operations it can perform. The small ALU we are making will handle addition, subtraction, bitwise XOR, bitwise AND, and bitwise NOT. Most of these operations take two operands (NOT takes only one), which are inputs to the small ALU. Given an op-code, the result is calculated and then provided as output.
The entity declaration defines the inputs and outputs of the small ALU block. If you think of a block diagram, the ports going into the block and the ports coming out of the block are defined here. An ALU block can also have a generic, used in this case to declare the width of the operands, op-code, and result bus. If the generic is set to eight bits, then we have a simple microprocessor. If the generic is set to 64 bits, then we have the possibility of making a high-performance processor. We will stick with the small 8-bit ALU (alu_8) here, with the width written directly into the port declarations.
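For comparison, a width-generic entity declaration might look like the sketch below. The names alu_n and g_nbits are illustrative assumptions and do not appear in the listings in this article; only the port names are borrowed from alu_8. Following the description above, the one generic sets the width of the operands, the op-code, and the result.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical generic-width variant of the ALU entity.
-- alu_n and g_nbits are assumed names used for illustration only.
entity alu_n is
  generic(g_nbits : positive := 8);  -- bus width in bits, defaults to 8
  port(i_clk : in  std_logic;
       i_a   : in  std_logic_vector(g_nbits-1 downto 0);
       i_b   : in  std_logic_vector(g_nbits-1 downto 0);
       i_op  : in  std_logic_vector(g_nbits-1 downto 0);
       i_dv  : in  std_logic;
       o_res : out std_logic_vector(g_nbits-1 downto 0);
       o_dv  : out std_logic
      );
end entity alu_n;
```

With a matching architecture, an instance could select the width with generic map(g_nbits => 64), while leaving the generic map out would fall back to the 8-bit default.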
Next, the inputs a and b are 8 bits wide, just like the op-code bus. An 8-bit op-code limits the ALU to 256 op-codes, which is not a limitation for this example. For the operands, however, 8 bits can be a limitation, because they are provided by the user and can take any 8-bit value. The issue occurs when the operands are added together. If we add the signed 8-bit numbers 64 + 64, the answer is 128. In hex this is 0x40 + 0x40 = 0x80, but the signed 8-bit value 0x80 is -128. Special care must be taken to ensure calculations are not corrupted by data rolling over.
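The roll-over described above can be reproduced in a few lines of simulation-only VHDL. The sketch below is not part of the ALU design; the entity name tb_overflow and the variable names are assumptions made for illustration. It adds 64 + 64 once at 8 bits and once after sign-extending both operands to 9 bits, which is one possible way to keep the true sum.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Simulation-only demonstration of signed 8-bit roll-over.
-- tb_overflow and the v_* names are illustrative assumptions.
entity tb_overflow is
end entity tb_overflow;

architecture sim of tb_overflow is
begin
  p_demo : process
    variable v_a    : signed(7 downto 0) := to_signed(64, 8);  -- 0x40
    variable v_b    : signed(7 downto 0) := to_signed(64, 8);  -- 0x40
    variable v_sum8 : signed(7 downto 0);
    variable v_sum9 : signed(8 downto 0);
  begin
    -- numeric_std "+" returns a result as wide as the wider operand,
    -- so a sum that does not fit in 8 signed bits wraps around.
    v_sum8 := v_a + v_b;
    assert to_integer(v_sum8) = -128
      report "expected the 8-bit sum to wrap to -128" severity failure;

    -- Sign-extending both operands by one bit keeps the true sum.
    v_sum9 := resize(v_a, 9) + resize(v_b, 9);
    assert to_integer(v_sum9) = 128
      report "expected the 9-bit sum to be 128" severity failure;

    report "8-bit sum = " & integer'image(to_integer(v_sum8)) &
           ", 9-bit sum = " & integer'image(to_integer(v_sum9));
    wait;  -- stop the process so the simulation can end
  end process;
end architecture sim;
```

Widening the result is only one option. A design could instead keep the 8-bit result and raise a separate overflow flag, which is a trade-off between bus width and extra status logic.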
The last two input signals are the clock and the data-valid flag. The clock is the time reference and is used to register signals in D flip-flops. The input data-valid flag is asserted when the op-code and operands are valid. The output signals are the result of the operation and an output data-valid flag, which is asserted once the calculation is complete.
The simulation example below shows how the simple ALU works. At the rising edge of the clock, when the input data-valid (w_idv) is logical '1', input a and input b (w_a and w_b respectively) are registered. The next rising edge performs the calculation on the operands for the given op-code, and finally the correct value for the op-code is provided on w_res when the output data-valid flag (w_odv) is asserted.
-- tb_alu.vhd
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity tb_alu is
end entity tb_alu;

architecture tb of tb_alu is
  type t_slv8 is array (natural range <>) of std_logic_vector(7 downto 0);
  constant k_nbits : integer := 8;
  constant k_nops  : integer := 10;
  constant k_a  : t_slv8(0 to k_nops) := (x"01",x"02",x"03",x"04",x"05",x"06",x"07",x"08",x"09",x"0A",x"0B");
  constant k_b  : t_slv8(0 to k_nops) := (x"42",x"44",x"46",x"48",x"4A",x"4C",x"4E",x"51",x"53",x"55",x"57");
  constant k_op : t_slv8(0 to k_nops) := (x"00",x"00",x"00",x"00",x"00",x"00",x"00",x"00",x"00",x"00",x"00");
  signal w_clk : std_logic := '0';
  signal w_idv : std_logic := '0';
  signal w_odv : std_logic := '0';
  signal w_a   : std_logic_vector(k_nbits-1 downto 0) := (others => '0');
  signal w_b   : std_logic_vector(k_nbits-1 downto 0) := (others => '0');
  signal w_op  : std_logic_vector(k_nbits-1 downto 0) := (others => '0');
  signal w_res : std_logic_vector(k_nbits-1 downto 0) := (others => '0');
begin
  u_alu : entity work.alu_8
    port map(i_clk => w_clk,
             i_a   => w_a,
             i_b   => w_b,
             i_op  => w_op,
             i_dv  => w_idv,
             o_res => w_res,
             o_dv  => w_odv
            );

  p_clk : process
  begin
    wait for 5 ns;
    w_clk <= not w_clk;
  end process;

  p_mp : process
    procedure launch_op_test(iop : in std_logic_vector(7 downto 0);
                             signal oa   : out std_logic_vector(7 downto 0);
                             signal ob   : out std_logic_vector(7 downto 0);
                             signal oop  : out std_logic_vector(7 downto 0);
                             signal oidv : out std_logic) is
    begin
      oop <= iop;
      for a in 2 to 2 loop
        for b in 65 to 66 loop
          wait until rising_edge(w_clk);
          oa   <= std_logic_vector(to_signed(a,8));
          ob   <= std_logic_vector(to_signed(b,8));
          oidv <= '1';
          wait until rising_edge(w_clk);
          oidv <= '0';
        end loop;
      end loop;
    end launch_op_test;
  begin
    launch_op_test(x"00", w_a, w_b, w_op, w_idv);
    wait until rising_edge(w_clk);
    wait until rising_edge(w_clk);
    launch_op_test(x"01", w_a, w_b, w_op, w_idv);
    wait until rising_edge(w_clk);
    wait until rising_edge(w_clk);
    launch_op_test(x"02", w_a, w_b, w_op, w_idv);
    wait until rising_edge(w_clk);
    wait until rising_edge(w_clk);
    launch_op_test(x"03", w_a, w_b, w_op, w_idv);
    wait until rising_edge(w_clk);
    wait until rising_edge(w_clk);
    launch_op_test(x"04", w_a, w_b, w_op, w_idv);
    wait until rising_edge(w_clk);
    wait until rising_edge(w_clk);
    launch_op_test(x"05", w_a, w_b, w_op, w_idv);
    wait;
  end process;
end tb;
-- Simple ALU 8-bits
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

entity alu_8 is
  port(i_clk : in  std_logic;
       i_a   : in  std_logic_vector(7 downto 0);
       i_b   : in  std_logic_vector(7 downto 0);
       i_op  : in  std_logic_vector(7 downto 0);
       i_dv  : in  std_logic;
       o_res : out std_logic_vector(7 downto 0);
       o_dv  : out std_logic
      );
end entity alu_8;

architecture rtl of alu_8 is
  -- Input Registers
  signal f_a  : std_logic_vector(7 downto 0);
  signal f_b  : std_logic_vector(7 downto 0);
  signal f_op : std_logic_vector(7 downto 0);
  -- Output Registers
  signal f_res: std_logic_vector(7 downto 0);
  -- Data Valid Shift Register
  signal f_dv : std_logic_vector(3 downto 0) := (others => '0');
  -- Define Operation Codes
  constant op_add : std_logic_vector(7 downto 0) := x"00";
  constant op_sub : std_logic_vector(7 downto 0) := x"01";
  constant op_xor : std_logic_vector(7 downto 0) := x"02";
  constant op_and : std_logic_vector(7 downto 0) := x"03";
  constant op_nota: std_logic_vector(7 downto 0) := x"04";
  constant op_notb: std_logic_vector(7 downto 0) := x"05";
begin
  o_res <= f_res;
  o_dv  <= f_dv(1);

  p_calc_alu : process(i_clk)
  begin
    if rising_edge(i_clk) then
      f_a  <= i_a;
      f_b  <= i_b;
      f_op <= i_op;
      f_dv <= f_dv(2 downto 0) & i_dv;
      case f_op is
        when op_add  => f_res <= std_logic_vector(signed(f_a) + signed(f_b));
        when op_sub  => f_res <= std_logic_vector(signed(f_a) - signed(f_b));
        when op_xor  => f_res <= f_a xor f_b;
        when op_and  => f_res <= f_a and f_b;
        when op_nota => f_res <= not(f_a);
        when op_notb => f_res <= not(f_b);
        when others  => f_res <= (others => '0');
      end case;
    end if;
  end process;
end rtl;
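The testbench above only drives stimulus, so the result has to be read from the waveform. As a possible extension, the sketch below checks one addition automatically. It assumes alu_8 as listed above has been compiled into library work. The names tb_alu_check, p_check, and w_done are illustrative and not part of the original testbench. It relies on the two-stage behaviour of the listing: operands are registered on one rising edge, the result is registered on the next, and o_dv (f_dv(1)) goes high together with the result.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical self-checking testbench for alu_8 (assumed to be in work).
entity tb_alu_check is
end entity tb_alu_check;

architecture tb of tb_alu_check is
  signal w_clk  : std_logic := '0';
  signal w_idv  : std_logic := '0';
  signal w_odv  : std_logic;
  signal w_a    : std_logic_vector(7 downto 0) := (others => '0');
  signal w_b    : std_logic_vector(7 downto 0) := (others => '0');
  signal w_op   : std_logic_vector(7 downto 0) := (others => '0');
  signal w_res  : std_logic_vector(7 downto 0);
  signal w_done : boolean := false;  -- stops the clock when the check ends
begin
  u_alu : entity work.alu_8
    port map(i_clk => w_clk, i_a => w_a, i_b => w_b, i_op => w_op,
             i_dv => w_idv, o_res => w_res, o_dv => w_odv);

  -- 100 MHz clock that halts once w_done is set, so the simulation ends.
  p_clk : process
  begin
    while not w_done loop
      wait for 5 ns;
      w_clk <= not w_clk;
    end loop;
    wait;
  end process;

  p_check : process
  begin
    -- Present 2 + 65 with the add op-code (x"00") for one clock cycle.
    wait until rising_edge(w_clk);
    w_a   <= std_logic_vector(to_signed(2, 8));
    w_b   <= std_logic_vector(to_signed(65, 8));
    w_op  <= x"00";
    w_idv <= '1';
    wait until rising_edge(w_clk);
    w_idv <= '0';

    -- Sample on the first rising edge where the output flag is high.
    wait until rising_edge(w_clk) and w_odv = '1';
    assert signed(w_res) = to_signed(67, 8)
      report "add: unexpected result on w_res" severity failure;
    report "add check passed";
    w_done <= true;
    wait;
  end process;
end architecture tb;
```

The same pattern could be repeated for the other op-codes by changing w_op and the expected value in the assertion.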