URL
https://opencores.org/ocsvn/open8_urisc/open8_urisc/trunk
Subversion Repositories open8_urisc
[/] [open8_urisc/] [trunk/] [VHDL/] [o8_cpu.vhd] - Rev 279
Go to most recent revision | Compare with Previous | Blame | View Log
-- Copyright (c)2006, 2011, 2012, 2013, 2015, 2019, 2020 Jeremy Seth Henry -- All rights reserved. -- -- Redistribution and use in source and binary forms, with or without -- modification, are permitted provided that the following conditions are met: -- * Redistributions of source code must retain the above copyright -- notice, this list of conditions and the following disclaimer. -- * Redistributions in binary form must reproduce the above copyright -- notice, this list of conditions and the following disclaimer in the -- documentation and/or other materials provided with the distribution, -- where applicable (as part of a user interface, debugging port, etc.) -- -- THIS SOFTWARE IS PROVIDED BY JEREMY SETH HENRY ``AS IS'' AND ANY -- EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED -- WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE -- DISCLAIMED. IN NO EVENT SHALL JEREMY SETH HENRY BE LIABLE FOR ANY -- DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES -- (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; -- LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND -- ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT -- (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF -- THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. -- -- VHDL Units : o8_cpu -- Description: VHDL model of a RISC 8-bit processor core loosely based on the -- : V8/ARC uRISC instruction set. Requires Open8_pkg.vhd -- : -- Notes : Generic definitions -- : -- : Program_Start_Addr sets the initial value of the program -- : counter. -- : -- : ISR_Start_Addr sets the location of the interrupt service -- : vector table. There are 8 service vectors, or 16 bytes, which -- : must be allocated to either ROM or RAM. -- : -- : Stack_Start_Address sets the initial (reset) value of the -- : stack pointer. Also used for the RSP instruction if -- : Allow_Stack_Address_Move is false. -- : -- : Allow_Stack_Address_Move, when set true, allows the RSP to be -- : programmed via thet RSP instruction. If enabled, the -- : instruction changes into TSX or TXS based on the flag -- : specified by STACK_XFER_FLAG. If the flag is '0', RSP will -- : copy the current stack pointer to R1:R0 (TSX). If the flag -- : is '1', RSP will copy R1:R0 to the stack pointer (TXS). This -- : allows the processor to backup and restore stack pointers -- : in a multi-process environment. Note that no flags are -- : modified by either form of this instruction. -- : -- : STACK_XFER_FLAG instructs the core to use the specified ALU -- : flag to alter the behavior of the RSP instruction when -- : Allow_Stack_Address_Move is set TRUE, otherwise it's ignored. -- : While technically any of the status bits may be used, the -- : intent was to use FL_GP[1,2,3,4], as these are not modified -- : by ordinary ALU operations. -- : -- : The Enable_Auto_Increment generic can be used to modify the -- : indexed instructions such that specifying an odd register -- : will use the next lower register pair, post-incrementing the -- : value in that pair. IOW, specifying STX R1 will instead -- : result in STX R0++, or R0 = {R1:R0}; {R1:R0} + 1 -- : -- : BRK_Implements_WAI modifies the BRK instruction such that it -- : triggers the wait for interrupt state, but without triggering -- : a soft interrupt in lieu of its normal behavior, which is to -- : insert several dead clock cycles - essentially a long NOP -- : -- : Enable_NMI overrides the mask bit for interrupt 0, creating a -- : non-maskable interrupt at the highest priority. To remain -- : true to the original core, this should be set false. -- : -- : Sequential_Interrupts, when set, prevents interrupt service -- : routines from being interrupted by postponing an later -- : interrupts until the I bit is cleared (usually with an RTI, -- : but a CLP PSR_I will also work). This is potentially -- : dangerous, as it means a lower-priority ISR can "hog" the CPU -- : by failing to return. However, it can also prevent the -- : condition of an ISR interrupting itself until it causes a -- : memory fault. (For example, an interrupt source that whose -- : period is shorter than the ISR service time) Note that this -- : setting alters the way the pending logic works, so it affects -- : all interrupts, including the NMI. If this is set, special -- : care should be taken to make sure ISRs are short and always -- : execute an RTI at the end. -- : -- : RTI_Ignores_GP_Flags alters the set of flag bits restored -- : after an interrupt. By default, all of the flag bits are put -- : back to their original state. If this flag is set true, only -- : the lower four bits are restored, allowing ISR code to alter -- : the GP flags persistently. -- : -- : Supervisor_Mode, when set, disables the STP PSR_I instruction -- : preventing code from setting the I bit. When enabled, only -- : interrupts can set the I bit, allowing for more robust memory -- : protection by preventing errant code execution from -- : inadvertently entering an interrupt state. -- : -- : This setting also sets I bit at startup so that any -- : initialization code may be run in an ISR context, initially -- : bypassing memory protection. Init code should clear the I bit -- : when done; -- : -- : Unsigned_Index_Offsets alters the way offsets are added to -- : [Rn+1:Rn] during LDO/STO instructions. The original, default -- : behavior treats these offsets as signed values, allowing -- : instructions to offset by -128 to +127 from [Rn+1:Rn]. -- : Setting this generic to TRUE will switch to unsigned offsets, -- : switching the range to 0 to 255 instead. -- : -- : Default_Interrupt_Mask sets the intial/reset value of the -- : interrupt mask. To remain true to the original core, which -- : had no interrupt mask, this should be set to x"FF". Otherwise -- : it can be initialized to any value. Note that Enable_NMI -- : will logically force the LSB high. -- : -- : Reset_Level determines whether the processor registers reset -- : on a high or low level from higher logic. -- : -- : Architecture notes -- : This model deviates from the original ISA in a few important -- : ways. -- : -- : First, there is only one set of registers. Interrupt service -- : routines must explicitely preserve context since the the -- : hardware doesn't. This was done to decrease size and code -- : complexity. Older code that assumes this behavior will not -- : execute correctly on this processor model. -- : -- : Second, this model adds an additional pipeline stage between -- : the instruction decoder and the ALU. Unfortunately, this -- : means that the instruction stream has to be restarted after -- : any math instruction is executed, implying that any ALU -- : instruction now has a latency of 2 instead of 0. The -- : advantage is that the maximum frequency has gone up -- : significantly, as the ALU code is vastly more efficient. -- : As an aside, this now means that all math instructions, -- : including MUL (see below) and UPP have the same instruction -- : latency. -- : -- : Third, the original ISA, also a soft core, had two reserved -- : instructions, USR and USR2. These have been implemented as -- : DBNZ, and MUL respectively. -- : -- : DBNZ decrements the specified register and branches if the -- : result is non-zero. The instruction effectively executes a -- : DEC Rn instruction prior to branching, so the same flags will -- : be set. -- : -- : MUL places the result of R0 * Rn into R1:R0. Instruction -- : latency is identical to other ALU instructions. Only the Z -- : flag is set, since there is no defined overflow or "negative -- : 16-bit values" -- : -- : Fourth, indexed load/store instructions now have an (optional) -- : ability to post-increment their index registers. If enabled, -- : using an odd operand for LDO,LDX, STO, STX will cause the -- : register pair to be incremented after the storage access. -- : -- : Fifth, the RSP instruction has been (optionally) altered to -- : allow the stack pointer to be sourced from R1:R0. -- : -- : Sixth, the BRK instruction can optionally implement a WAI, -- : which is the same as the INT instruction without the soft -- : interrupt, as a way to put the processor to "sleep" until the -- : next external interrupt. -- : -- : Seventh, the original CPU model had 8 non-maskable interrupts -- : with priority. This model has the same 8 interrupts, but -- : allows software to mask them (with an additional option to -- : override the highest priority interrupt, making it the NMI.) -- : -- : Lastly, previous unmapped instructions in the OP_STK opcode -- : were repurposed to support a new interrupt mask. -- : SMSK and GMSK transfer the contents of R0 (accumulator) -- : to/from the interrupt mask register. SMSK is immediate, while -- : GMSK has the same overhead as a math instruction. -- -- Revision History -- Author Date Change ------------------ -------- --------------------------------------------------- -- Seth Henry 07/19/06 Design Start -- Seth Henry 01/18/11 Fixed BTT instruction to match V8 -- Seth Henry 07/22/11 Fixed interrupt transition logic to avoid data -- corruption issues. -- Seth Henry 07/26/11 Optimized logic in ALU, stack pointer, and data -- path sections. -- Seth Henry 07/27/11 Optimized logic for timing, merged blocks into -- single entity. -- Seth Henry 09/20/11 Added BRK_Implements_WAI option, allowing the -- processor to wait for an interrupt instead of the -- normal BRK behavior. -- Seth Henry 12/20/11 Modified core to allow WAI_Cx state to idle -- the bus entirely (Rd_Enable is low) -- Seth Henry 02/03/12 Replaced complex interrupt controller with simpler, -- faster logic that simply does priority encoding. -- Seth Henry 08/06/13 Removed HALT functionality -- Seth Henry 10/29/15 Fixed inverted carry logic in CMP and SBC instrs -- Seth Henry 12/19/19 Renamed to o8_cpu to fit "theme" -- Seth Henry 03/09/20 Modified RSP instruction to work with a CPU flag -- allowing true backup/restore of the stack pointer -- Seth Henry 03/11/20 Split the address logic from the main state machine -- in order to simplify things and eliminate -- redundancies. Came across and fixed a problem with -- the STO instruction when Enable_Auto_Increment is -- NOT set. -- Seth Henry 03/12/20 Rationalized the naming of the CPU flags to match -- the assembler names. Also fixed an issue where -- the I bit wasn't being cleared after interrupts. -- Simplified the program counter logic to only use -- the offset for increments, redefining the -- original modes as fixed offset values. -- Modified the ALU section with a new ALU operation -- for GMSK. This allowed the .data field to be -- removed and Operand1 used in its place, which -- simplified the logic a great deal. -- Seth Henry 03/16/20 Added CPU_Halt input back, only now as an input to -- the instruction decode state, where it acts as a -- modified form of the BRK instruction that holds -- state until CPU_Halt is deasserted. This has a -- much smaller impact on Fmax/complexity than the -- original clock enable, but imposes a mild impact -- due to the need to reset the instruction pipeline -- Seth Henry 03/17/20 Added generic to control whether RTI full restores -- the flags, including the general purpose ones, or -- only the core ALU flags (Z, N, and C). Also -- brought out copies of the GP flags for external -- connection. -- Seth Henry 04/09/20 Added a compile time setting to block interrupts -- while the I bit is set to avoid reentering ISRs -- This may slightly affect timing, as this will -- potentially block higher priority interrupts -- until the lower priority ISR returns or clears -- the I bit. -- Also added the I bit to the exported flags for -- use in memory protection schemes. -- Seth Henry 04/16/20 Modified to use new Open8 bus record. Also added -- reset and usec_tick logic to drive utility -- signals. Also added Halt_Ack output. -- Seth Henry 05/20/20 Added two new generics to alter the way the I bit -- is handled. The Supervisor_Mode setting disables -- STP PSR_I from being executed, preventing it -- from being set outside of an ISR. The -- Default_Int_Flag setting allows the I bit to -- start set so that initialization code can run, -- but not be hijacked later to corrupt any memory -- write protection later. -- Seth Henry 05/21/20 Supervisor_Mode now protects the interrupt mask -- and stack pointer as well. -- Seth Henry 05/24/20 Removed the Default_Int_Flag, as it is covered by -- Supervisor_Mode. If Supervisor_Mode isn't set, -- code can simply use STP to set the bit -- Seth Henry 06/09/20 Added ability to use unsigned index offsets for -- LDO/STO. Also pipelined the address calculation -- for indexed instructions, reducing the final -- address generator to a multiplexor fed only by -- registers. -- Seth Henry 07/10/20 Fixed a bug in the LDO/LDX logic where the register -- pair wasn't being incremented properly due to a -- missing UPP2 signal to the ALU. -- Seth Henry 10/21/20 Modified the write data path to use separate -- enumerated states rather than reuse the .reg field -- to improve performance. -- Seth Henry 10/23/20 Moved CPU internal constants to o8_cpu.vhd. Also -- removed Stack_Xfer_Flag, which specified the CPU -- flag used to alter the RSP instruction, making it -- a constant instead (PSR_GP4). This eliminated the -- need to expose an internal constant externally library ieee; use ieee.std_logic_1164.all; use ieee.std_logic_unsigned.all; use ieee.std_logic_arith.all; use ieee.std_logic_misc.all; library work; use work.Open8_pkg.all; entity o8_cpu is generic( Program_Start_Addr : ADDRESS_TYPE := x"0000"; -- Initial PC location ISR_Start_Addr : ADDRESS_TYPE := x"FFF0"; -- Bottom of ISR vec's Stack_Start_Addr : ADDRESS_TYPE := x"03FF"; -- Top of Stack Allow_Stack_Address_Move : boolean := false; -- Use Normal v8 RSP Enable_Auto_Increment : boolean := false; -- Modify indexed instr BRK_Implements_WAI : boolean := false; -- BRK -> Wait for Int Enable_NMI : boolean := true; -- Force INTR0 enabled Sequential_Interrupts : boolean := false; -- Interruptable ISRs RTI_Ignores_GP_Flags : boolean := false; -- RTI sets all flags Supervisor_Mode : boolean := false; -- I bit is restricted Unsigned_Index_Offsets : boolean := false; -- Offsets are signed Default_Interrupt_Mask : DATA_TYPE := x"FF"; -- Enable all Ints Clock_Frequency : real -- Clock Frequency ); port( Clock : in std_logic; PLL_Locked : in std_logic; -- Halt_Req : in std_logic := '0'; Halt_Ack : out std_logic; -- Open8_Bus : out OPEN8_BUS_TYPE; Rd_Data : in DATA_TYPE; Interrupts : in INTERRUPT_BUNDLE := x"00" ); end entity; architecture behave of o8_cpu is signal Reset_q : std_logic := Reset_Level; signal Reset : std_logic := Reset_Level; constant USEC_VAL : integer := integer(Clock_Frequency / 1000000.0); constant USEC_WDT : integer := ceil_log2(USEC_VAL - 1); constant USEC_DLY : std_logic_vector := conv_std_logic_vector(USEC_VAL - 1, USEC_WDT); signal uSec_Cntr : std_logic_vector( USEC_WDT - 1 downto 0 ); signal uSec_Tick : std_logic; -- CPU Instruction Set Definitions subtype OPCODE_TYPE is std_logic_vector(4 downto 0); subtype SUBOP_TYPE is std_logic_vector(2 downto 0); -- All opcodes should be identical to the opcode used by the assembler -- In this case, they match the original V8/ARC uRISC ISA constant OP_INC : OPCODE_TYPE := "00000"; constant OP_ADC : OPCODE_TYPE := "00001"; constant OP_TX0 : OPCODE_TYPE := "00010"; constant OP_OR : OPCODE_TYPE := "00011"; constant OP_AND : OPCODE_TYPE := "00100"; constant OP_XOR : OPCODE_TYPE := "00101"; constant OP_ROL : OPCODE_TYPE := "00110"; constant OP_ROR : OPCODE_TYPE := "00111"; constant OP_DEC : OPCODE_TYPE := "01000"; constant OP_SBC : OPCODE_TYPE := "01001"; constant OP_ADD : OPCODE_TYPE := "01010"; constant OP_STP : OPCODE_TYPE := "01011"; constant OP_BTT : OPCODE_TYPE := "01100"; constant OP_CLP : OPCODE_TYPE := "01101"; constant OP_T0X : OPCODE_TYPE := "01110"; constant OP_CMP : OPCODE_TYPE := "01111"; constant OP_PSH : OPCODE_TYPE := "10000"; constant OP_POP : OPCODE_TYPE := "10001"; constant OP_BR0 : OPCODE_TYPE := "10010"; constant OP_BR1 : OPCODE_TYPE := "10011"; constant OP_DBNZ : OPCODE_TYPE := "10100"; -- USR constant OP_INT : OPCODE_TYPE := "10101"; constant OP_MUL : OPCODE_TYPE := "10110"; -- USR2 constant OP_STK : OPCODE_TYPE := "10111"; constant OP_UPP : OPCODE_TYPE := "11000"; constant OP_STA : OPCODE_TYPE := "11001"; constant OP_STX : OPCODE_TYPE := "11010"; constant OP_STO : OPCODE_TYPE := "11011"; constant OP_LDI : OPCODE_TYPE := "11100"; constant OP_LDA : OPCODE_TYPE := "11101"; constant OP_LDX : OPCODE_TYPE := "11110"; constant OP_LDO : OPCODE_TYPE := "11111"; -- OP_STK uses the lower 3 bits to further refine the instruction by -- repurposing the source register field. These "sub opcodes" take -- the place of the register select for the OP_STK opcode constant SOP_RSP : SUBOP_TYPE := "000"; constant SOP_RTS : SUBOP_TYPE := "001"; constant SOP_RTI : SUBOP_TYPE := "010"; constant SOP_BRK : SUBOP_TYPE := "011"; constant SOP_JMP : SUBOP_TYPE := "100"; constant SOP_SMSK : SUBOP_TYPE := "101"; constant SOP_GMSK : SUBOP_TYPE := "110"; constant SOP_JSR : SUBOP_TYPE := "111"; -- These should match the assembler's definitions for the flags constant PSR_Z : integer := 0; constant PSR_C : integer := 1; constant PSR_N : integer := 2; constant PSR_I : integer := 3; constant PSR_GP4 : integer := 4; constant PSR_GP5 : integer := 5; constant PSR_GP6 : integer := 6; constant PSR_GP7 : integer := 7; -- Internal CPU Signals & Constants type CPU_STATES is ( -- Instruction fetch & Decode IPF_C0, IPF_C1, IPF_C2, IDC_C0, -- Branching BRN_C1, DBNZ_C1, JMP_C1, JMP_C2, -- Loads LDA_C1, LDA_C2, LDA_C3, LDA_C4, LDI_C1, LDO_C1, LDO_C2, LDX_C1, LDX_C2, LDX_C3, LDX_C4, -- Stores STA_C1, STA_C2, STA_C3, STO_C1, STO_C2, STO_C3, STX_C1, STX_C2, -- 2-cycle math MUL_C1, UPP_C1, -- Stack PSH_C1, POP_C1, POP_C2, POP_C3, POP_C4, -- Subroutines & Interrupts WAI_Cx, WAH_Cx, BRK_C1, ISR_C1, ISR_C2, ISR_C3, JSR_C1, JSR_C2, RTS_C1, RTS_C2, RTS_C3, RTS_C4, RTS_C5, RTI_C6 ); type CACHE_MODES is (CACHE_IDLE, CACHE_INSTR, CACHE_OPER1, CACHE_OPER2, CACHE_PREFETCH ); type PC_MODES is ( PC_INCR, PC_LOAD ); type PC_CTRL_TYPE is record Oper : PC_MODES; Offset : DATA_TYPE; end record; -- These are fixed constant offsets to the program counter logic, which is -- always either incrementing or loading. constant PC_NEXT : DATA_TYPE := x"03"; constant PC_IDLE : DATA_TYPE := x"02"; constant PC_REV1 : DATA_TYPE := x"01"; constant PC_REV2 : DATA_TYPE := x"00"; constant PC_REV3 : DATA_TYPE := x"FF"; type SP_MODES is ( SP_IDLE, SP_CLR, SP_SET, SP_POP, SP_PUSH ); type SP_CTRL_TYPE is record Oper : SP_MODES; end record; -- This constant determines which CPU flag is used to switch the -- direction of the modified RSP instruction constant STACK_XFER_FLAG : integer := PSR_GP4; -- GP4 modifies RSP type DP_MODES is ( DATA_BUS_IDLE, DATA_RD_MEM, DATA_WR_REG, DATA_WR_FLAG, DATA_WR_PC_L, DATA_WR_PC_H ); type DATA_CTRL_TYPE is record Src : DP_MODES; Reg : SUBOP_TYPE; end record; type INT_CTRL_TYPE is record Mask_Set : std_logic; Soft_Ints : INTERRUPT_BUNDLE; Incr_ISR : std_logic; end record; -- Most of the ALU instructions are the same as their Opcode equivalents, -- with exceptions for IDLE, UPP2, RFLG, RSP, and GMSK, which perform -- internal operations not otherwise exposed by the instruction set. constant ALU_INC : OPCODE_TYPE := "00000"; -- x"00" constant ALU_ADC : OPCODE_TYPE := "00001"; -- x"01" constant ALU_TX0 : OPCODE_TYPE := "00010"; -- x"02" constant ALU_OR : OPCODE_TYPE := "00011"; -- x"03" constant ALU_AND : OPCODE_TYPE := "00100"; -- x"04" constant ALU_XOR : OPCODE_TYPE := "00101"; -- x"05" constant ALU_ROL : OPCODE_TYPE := "00110"; -- x"06" constant ALU_ROR : OPCODE_TYPE := "00111"; -- x"07" constant ALU_DEC : OPCODE_TYPE := "01000"; -- x"08" constant ALU_SBC : OPCODE_TYPE := "01001"; -- x"09" constant ALU_ADD : OPCODE_TYPE := "01010"; -- x"0A" constant ALU_STP : OPCODE_TYPE := "01011"; -- x"0B" constant ALU_BTT : OPCODE_TYPE := "01100"; -- x"0C" constant ALU_CLP : OPCODE_TYPE := "01101"; -- x"0D" constant ALU_T0X : OPCODE_TYPE := "01110"; -- x"0E" constant ALU_CMP : OPCODE_TYPE := "01111"; -- x"0F" constant ALU_POP : OPCODE_TYPE := "10001"; -- x"11" constant ALU_MUL : OPCODE_TYPE := "10110"; -- x"16" constant ALU_UPP : OPCODE_TYPE := "11000"; -- x"18" constant ALU_LDI : OPCODE_TYPE := "11100"; -- x"1C" constant ALU_IDLE : OPCODE_TYPE := "10000"; -- x"10" constant ALU_UPP2 : OPCODE_TYPE := "10010"; -- x"12" constant ALU_RFLG : OPCODE_TYPE := "10011"; -- x"13" constant ALU_RSP : OPCODE_TYPE := "10111"; -- x"17" constant ALU_GMSK : OPCODE_TYPE := "11111"; -- x"1F" type ALU_CTRL_TYPE is record Oper : OPCODE_TYPE; Reg : SUBOP_TYPE; end record; constant ACCUM : SUBOP_TYPE := "000"; type REGFILE_TYPE is array (0 to 7) of DATA_TYPE; subtype FLAG_TYPE is DATA_TYPE; signal CPU_Next_State : CPU_STATES := IPF_C0; signal CPU_State : CPU_STATES := IPF_C0; signal CPU_Halt_Req : std_logic := '0'; signal CPU_Halt_Ack : std_logic := '0'; signal Cache_Ctrl : CACHE_MODES := CACHE_IDLE; signal Opcode : OPCODE_TYPE := (others => '0'); signal SubOp, SubOp_p1 : SUBOP_TYPE := (others => '0'); signal Prefetch : DATA_TYPE := x"00"; signal Operand1, Operand2 : DATA_TYPE := x"00"; signal Instr_Prefetch : std_logic := '0'; signal PC_Ctrl : PC_CTRL_TYPE; signal Program_Ctr : ADDRESS_TYPE := x"0000"; signal ALU_Ctrl : ALU_CTRL_TYPE; signal Regfile : REGFILE_TYPE; signal Flags : FLAG_TYPE; signal Mult : ADDRESS_TYPE := x"0000"; signal SP_Ctrl : SP_CTRL_TYPE; signal Stack_Ptr : ADDRESS_TYPE := x"0000"; signal DP_Ctrl : DATA_CTRL_TYPE; signal INT_Ctrl : INT_CTRL_TYPE; signal Ack_D, Ack_Q, Ack_Q1: std_logic := '0'; signal Int_Req, Int_Ack : std_logic := '0'; signal Set_Mask : std_logic := '0'; signal Int_Mask : DATA_TYPE := x"00"; signal i_Ints : INTERRUPT_BUNDLE := x"00"; signal Pending : INTERRUPT_BUNDLE := x"00"; signal Wait_for_FSM : std_logic := '0'; signal Wait_for_ISR : std_logic := '0'; alias ISR_Addr_Base is ISR_Start_Addr(15 downto 4); signal ISR_Addr_Offset : std_logic_vector(3 downto 0) := x"0"; constant INT_VECTOR_0 : std_logic_vector(3 downto 0) := x"0"; constant INT_VECTOR_1 : std_logic_vector(3 downto 0) := x"2"; constant INT_VECTOR_2 : std_logic_vector(3 downto 0) := x"4"; constant INT_VECTOR_3 : std_logic_vector(3 downto 0) := x"6"; constant INT_VECTOR_4 : std_logic_vector(3 downto 0) := x"8"; constant INT_VECTOR_5 : std_logic_vector(3 downto 0) := x"A"; constant INT_VECTOR_6 : std_logic_vector(3 downto 0) := x"C"; constant INT_VECTOR_7 : std_logic_vector(3 downto 0) := x"E"; signal IDX_Offset_SX : std_logic := '0'; signal IDX_Offset : ADDRESS_TYPE := x"0000"; signal IDX_Sel_l : std_logic_vector(2 downto 0) := "000"; signal IDX_Sel_h : std_logic_vector(2 downto 0) := "000"; signal IDX_Reg_l : integer := 0; signal IDX_Reg_h : integer := 0; signal IDX_NoOffset_Calc : ADDRESS_TYPE := x"0000"; signal IDX_Offset_Calc : ADDRESS_TYPE := x"0000"; begin ------------------------------------------------------------------------------- -- Reset & uSec Tick ------------------------------------------------------------------------------- CPU_Reset_Sync: process( Clock, PLL_Locked ) begin if( PLL_Locked = '0' )then Reset_q <= Reset_Level; Reset <= Reset_Level; elsif( rising_edge(Clock) )then Reset_q <= not Reset_Level; Reset <= Reset_q; end if; end process; uSec_Tick_proc: process( Clock, Reset ) begin if( Reset = Reset_Level )then uSec_Cntr <= USEC_DLY; uSec_Tick <= '0'; elsif( rising_edge( Clock ) )then uSec_Cntr <= uSec_Cntr - 1; if( or_reduce(uSec_Cntr) = '0' )then uSec_Cntr <= USEC_DLY; end if; uSec_Tick <= nor_reduce(uSec_Cntr); end if; end process; Open8_Bus.Clock <= Clock; Open8_Bus.Reset <= Reset; Open8_Bus.uSec_Tick <= uSec_Tick; ------------------------------------------------------------------------------- -- Address bus selection/generation logic ------------------------------------------------------------------------------- -- Address selection logic based on current CPU state. This is combinatorial, -- as adding pipeline registration would add a clock cycle to every instr, -- without really adding the Fmax to compensate. Address_Logic: process(CPU_State, Operand1, Operand2, IDX_NoOffset_Calc, IDX_Offset_Calc, ISR_Addr_Offset, Stack_Ptr, Program_Ctr ) begin case( CPU_State )is when LDA_C2 | STA_C2 => Open8_Bus.Address <= Operand2 & Operand1; when LDX_C1 | STX_C1 => Open8_Bus.Address <= IDX_NoOffset_Calc; when LDO_C2 | STO_C2 => Open8_Bus.Address <= IDX_Offset_Calc; when ISR_C1 | ISR_C2 => Open8_Bus.Address <= ISR_Addr_Base & ISR_Addr_Offset; when PSH_C1 | POP_C1 | ISR_C3 | JSR_C1 | JSR_C2 | RTS_C1 | RTS_C2 | RTS_C3 => Open8_Bus.Address <= Stack_Ptr; when others => Open8_Bus.Address <= Program_Ctr; end case; end process; -- The original model treated the offset to LDO/STO as a signed value -- allowing access to locations -128 to +127 from [Rn+1:Rn]. This isn't -- always helpful, so the generic allows the CPU to use unsigned math -- for the offsets. This makes the range 0 to +255 instead. IDX_Offset_SX <= '0' when Unsigned_Index_Offsets else Operand1(7); IDX_Offset(15 downto 8) <= (others => IDX_Offset_SX); IDX_Offset(7 downto 0) <= Operand1; -- Enable_Auto_Increment uses the LSB to determine whether or not to -- do the auto-increment, so we need to lock the LSB for each operand -- if it is enabled. This forces [ODD:EVEN] pairing. IDX_Sel_l <= (SubOp(2 downto 1) & '0') when Enable_Auto_Increment else SubOp; IDX_Sel_h <= (SubOp(2 downto 1) & '1') when Enable_Auto_Increment else SubOp_p1; IDX_Reg_l <= conv_integer(IDX_Sel_l); IDX_Reg_h <= conv_integer(IDX_Sel_h); -- Pipeline registers for the indexed and indexed with offset addresses. Idx_Addr_Calc_proc: process( Clock, Reset ) variable Reg, Reg_1 : integer range 0 to 7 := 0; begin if( Reset = Reset_Level )then IDX_NoOffset_Calc <= x"0000"; IDX_Offset_Calc <= x"0000"; elsif( rising_edge(Clock))then IDX_NoOffset_Calc <= (Regfile(IDX_Reg_h) & Regfile(IDX_Reg_l)); IDX_Offset_Calc <= (Regfile(IDX_Reg_h) & Regfile(IDX_Reg_l)) + IDX_Offset; end if; end process; ------------------------------------------------------------------------------- -- Combinatorial portion of CPU finite state machine -- State Logic / Instruction Decoding & Execution ------------------------------------------------------------------------------- State_Logic: process(CPU_State, Flags, Int_Mask, CPU_Halt_Req, Opcode, SubOp , SubOp_p1, Operand1, Operand2, Int_Req ) variable Reg : integer range 0 to 7 := 0; begin CPU_Next_State <= CPU_State; Cache_Ctrl <= CACHE_IDLE; -- PC_Ctrl.Oper <= PC_INCR; PC_Ctrl.Offset <= PC_IDLE; -- ALU_Ctrl.Oper <= ALU_IDLE; ALU_Ctrl.Reg <= ACCUM; -- SP_Ctrl.Oper <= SP_IDLE; -- DP_Ctrl.Src <= DATA_RD_MEM; DP_Ctrl.Reg <= ACCUM; -- INT_Ctrl.Mask_Set <= '0'; INT_Ctrl.Soft_Ints <= x"00"; INT_Ctrl.Incr_ISR <= '0'; Ack_D <= '0'; -- Reg := conv_integer(SubOp); -- CPU_Halt_Ack <= '0'; case CPU_State is ------------------------------------------------------------------------------- -- Initial Instruction fetch & decode ------------------------------------------------------------------------------- when IPF_C0 => CPU_Next_State <= IPF_C1; PC_Ctrl.Offset <= PC_NEXT; when IPF_C1 => CPU_Next_State <= IPF_C2; PC_Ctrl.Offset <= PC_NEXT; when IPF_C2 => CPU_Next_State <= IDC_C0; Cache_Ctrl <= CACHE_INSTR; PC_Ctrl.Offset <= PC_NEXT; when IDC_C0 => CPU_Next_State <= IDC_C0; Cache_Ctrl <= CACHE_INSTR; case Opcode is when OP_PSH => CPU_Next_State <= PSH_C1; Cache_Ctrl <= CACHE_PREFETCH; PC_Ctrl.Offset <= PC_REV1; DP_Ctrl.Src <= DATA_WR_REG; DP_Ctrl.Reg <= SubOp; when OP_POP => CPU_Next_State <= POP_C1; Cache_Ctrl <= CACHE_PREFETCH; PC_Ctrl.Offset <= PC_REV2; SP_Ctrl.Oper <= SP_POP; when OP_BR0 | OP_BR1 => CPU_Next_State <= BRN_C1; Cache_Ctrl <= CACHE_OPER1; PC_Ctrl.Offset <= PC_NEXT; when OP_DBNZ => CPU_Next_State <= DBNZ_C1; Cache_Ctrl <= CACHE_OPER1; PC_Ctrl.Offset <= PC_NEXT; ALU_Ctrl.Oper <= ALU_DEC; ALU_Ctrl.Reg <= SubOp; when OP_INT => PC_Ctrl.Offset <= PC_NEXT; -- Make sure the requested interrupt is actually enabled first. -- Also, unlike CPU_Halt, the INT instruction is actually being -- executed, so go ahead and increment the program counter before -- pausing so the CPU restarts on the next instruction. if( Int_Mask(Reg) = '1' )then CPU_Next_State <= WAI_Cx; INT_Ctrl.Soft_Ints(Reg) <= '1'; end if; when OP_STK => case SubOp is when SOP_RSP => PC_Ctrl.Offset <= PC_NEXT; if( not Allow_Stack_Address_Move )then -- The default behavior for this instruction is to simply -- repoint the SP to the HDL default SP_Ctrl.Oper <= SP_CLR; end if; if( Allow_Stack_Address_Move and Flags(STACK_XFER_FLAG) = '1' )then -- If RSP is set to allow SP moves, and the specified flag -- is true, then signal the stack pointer logic to load -- from R1:R0 SP_Ctrl.Oper <= SP_SET; end if; if( Allow_Stack_Address_Move and Flags(STACK_XFER_FLAG) = '0')then -- If RSP is set to allow SP moves, and the specified flag -- is false, then signal the ALU to copy the stack pointer -- to R1:R0 ALU_Ctrl.Oper <= ALU_RSP; end if; when SOP_RTS | SOP_RTI => CPU_Next_State <= RTS_C1; Cache_Ctrl <= CACHE_IDLE; SP_Ctrl.Oper <= SP_POP; when SOP_BRK => if( BRK_Implements_WAI )then -- If BRK_Implements_WAI, then jump to the WAI_Cx and -- increment the PC similar to an ISR flow. CPU_Next_State <= WAI_Cx; PC_Ctrl.Offset <= PC_NEXT; else -- If Break is implemented normally, back the PC up by -- 2 and return through IPF_C0 in order to execute a 3 -- clock cycle delay CPU_Next_State <= BRK_C1; PC_Ctrl.Offset <= PC_REV2; end if; when SOP_JMP => CPU_Next_State <= JMP_C1; Cache_Ctrl <= CACHE_OPER1; when SOP_SMSK => PC_Ctrl.Offset <= PC_NEXT; INT_Ctrl.Mask_Set <= '1'; when SOP_GMSK => PC_Ctrl.Offset <= PC_NEXT; ALU_Ctrl.Oper <= ALU_GMSK; when SOP_JSR => CPU_Next_State <= JSR_C1; Cache_Ctrl <= CACHE_OPER1; DP_Ctrl.Src <= DATA_WR_PC_H; when others => null; end case; when OP_MUL => CPU_Next_State <= MUL_C1; -- Multiplication requires a single clock cycle to calculate PRIOR -- to the ALU writing the result to registers. As a result, this -- state needs to idle the ALU initially, and back the PC up by 1 -- We can get away with only 1 extra clock by pre-fetching the -- next instruction, though. Cache_Ctrl <= CACHE_PREFETCH; PC_Ctrl.Offset <= PC_REV1; -- Note that both the multiply process AND ALU process need the -- source register for Rn (R1:R0 = R0 * Rn). Assert ALU_Ctrl.reg -- now, but hold off on the ALU command until the next state. ALU_Ctrl.Oper <= ALU_IDLE; ALU_Ctrl.Reg <= SubOp; when OP_UPP => CPU_Next_State <= UPP_C1; Cache_Ctrl <= CACHE_PREFETCH; PC_Ctrl.Offset <= PC_REV1; ALU_Ctrl.Oper <= Opcode; ALU_Ctrl.Reg <= SubOp; when OP_LDA => CPU_Next_State <= LDA_C1; Cache_Ctrl <= CACHE_OPER1; when OP_LDI => CPU_Next_State <= LDI_C1; Cache_Ctrl <= CACHE_OPER1; PC_Ctrl.Offset <= PC_NEXT; when OP_LDO => CPU_Next_State <= LDO_C1; Cache_Ctrl <= CACHE_OPER1; PC_Ctrl.Offset <= PC_REV2; when OP_LDX => CPU_Next_State <= LDX_C1; Cache_Ctrl <= CACHE_PREFETCH; PC_Ctrl.Offset <= PC_REV2; when OP_STA => CPU_Next_State <= STA_C1; Cache_Ctrl <= CACHE_OPER1; when OP_STO => CPU_Next_State <= STO_C1; Cache_Ctrl <= CACHE_OPER1; PC_Ctrl.Offset <= PC_REV1; when OP_STX => CPU_Next_State <= STX_C1; Cache_Ctrl <= CACHE_PREFETCH; PC_Ctrl.Offset <= PC_REV2; DP_Ctrl.Src <= DATA_WR_REG; DP_Ctrl.Reg <= ACCUM; when OP_STP => PC_Ctrl.Offset <= PC_NEXT; if( Supervisor_Mode )then if( SubOp /= PSR_I )then ALU_Ctrl.Oper <= Opcode; ALU_Ctrl.Reg <= SubOp; end if; else ALU_Ctrl.Oper <= Opcode; ALU_Ctrl.Reg <= SubOp; end if; when others => PC_Ctrl.Offset <= PC_NEXT; ALU_Ctrl.Oper <= Opcode; ALU_Ctrl.Reg <= SubOp; end case; if( Int_Req = '1' )then CPU_Next_State <= ISR_C1; end if; if( CPU_Halt_Req = '1' )then CPU_Next_State <= WAH_Cx; end if; -- If either of these override conditions are true, the decoder needs -- to undo everything it just setup, since even "single-cycle" -- instructions will be executed again upon return. if( Int_Req = '1' or CPU_Halt_Req = '1' )then -- In either case, we want to skip loading the cache, as the cache -- will be invalid by the time we get back. Cache_Ctrl <= CACHE_IDLE; -- Rewind the PC by 3 to put the PC back to the current instruction, -- compensating for the pipeline registers. PC_Ctrl.Offset <= PC_REV3; -- Reset all of the sub-block controls to IDLE, to avoid unintended -- operation due to the current instruction. ALU_Ctrl.Oper <= ALU_IDLE; SP_Ctrl.Oper <= SP_IDLE; -- Interrupt logic outside of the state machine needs this to be set -- to DATA_RD_MEM, while CPU_Halt considers this a "don't care". DP_Ctrl.Src <= DATA_RD_MEM; -- If an INT/SMSK instruction was going to be executed, it will get -- executed again when normal processing resumes, so axe their -- requests for now. INT_Ctrl.Mask_Set <= '0'; INT_Ctrl.Soft_Ints(Reg) <= '0'; end if; ------------------------------------------------------------------------------- -- Program Control (BRx, BNx, DBNZ, JMP ) ------------------------------------------------------------------------------- when BRN_C1 => CPU_Next_State <= IDC_C0; Cache_Ctrl <= CACHE_INSTR; PC_Ctrl.Offset <= PC_NEXT; if( Flags(Reg) = Opcode(0) )then CPU_Next_State <= IPF_C0; Cache_Ctrl <= CACHE_IDLE; PC_Ctrl.Offset <= Operand1; end if; when DBNZ_C1 => CPU_Next_State <= IDC_C0; Cache_Ctrl <= CACHE_INSTR; PC_Ctrl.Offset <= PC_NEXT; if( Flags(PSR_Z) = '0' )then CPU_Next_State <= IPF_C0; Cache_Ctrl <= CACHE_IDLE; PC_Ctrl.Offset <= Operand1; end if; when JMP_C1 => CPU_Next_State <= JMP_C2; Cache_Ctrl <= CACHE_OPER2; when JMP_C2 => CPU_Next_State <= IPF_C0; PC_Ctrl.Oper <= PC_LOAD; ------------------------------------------------------------------------------- -- Data Storage - Load from memory (LDA, LDI, LDO, LDX) ------------------------------------------------------------------------------- when LDA_C1 => CPU_Next_State <= LDA_C2; Cache_Ctrl <= CACHE_OPER2; when LDA_C2 => CPU_Next_State <= LDA_C3; when LDA_C3 => CPU_Next_State <= LDA_C4; PC_Ctrl.Offset <= PC_NEXT; when LDA_C4 => CPU_Next_State <= LDI_C1; Cache_Ctrl <= CACHE_OPER1; PC_Ctrl.Offset <= PC_NEXT; when LDI_C1 => CPU_Next_State <= IDC_C0; Cache_Ctrl <= CACHE_INSTR; PC_Ctrl.Offset <= PC_NEXT; ALU_Ctrl.Oper <= ALU_LDI; ALU_Ctrl.Reg <= SubOp; when LDO_C1 => CPU_Next_State <= LDO_C2; when LDO_C2 => CPU_Next_State <= LDX_C2; PC_Ctrl.Offset <= PC_NEXT; if( Enable_Auto_Increment and SubOp(0) = '1' )then ALU_Ctrl.Oper <= ALU_UPP; ALU_Ctrl.Reg <= SubOp(2 downto 1) & '0'; end if; when LDX_C1 => CPU_Next_State <= LDX_C2; if( Enable_Auto_Increment and SubOp(0) = '1' )then ALU_Ctrl.Oper <= ALU_UPP; ALU_Ctrl.Reg <= SubOp(2 downto 1) & '0'; end if; when LDX_C2 => CPU_Next_State <= LDX_C3; if( Enable_Auto_Increment and SubOp(0) = '1' )then ALU_Ctrl.Oper <= ALU_UPP2; ALU_Ctrl.Reg <= SubOp(2 downto 1) & '1'; end if; PC_Ctrl.Offset <= PC_NEXT; when LDX_C3 => CPU_Next_State <= LDX_C4; Cache_Ctrl <= CACHE_OPER1; PC_Ctrl.Offset <= PC_NEXT; when LDX_C4 => CPU_Next_State <= IDC_C0; Cache_Ctrl <= CACHE_INSTR; PC_Ctrl.Offset <= PC_NEXT; ALU_Ctrl.Oper <= ALU_LDI; ALU_Ctrl.Reg <= ACCUM; ------------------------------------------------------------------------------- -- Data Storage - Store to memory (STA, STO, STX) ------------------------------------------------------------------------------- when STA_C1 => CPU_Next_State <= STA_C2; Cache_Ctrl <= CACHE_OPER2; DP_Ctrl.Src <= DATA_WR_REG; DP_Ctrl.Reg <= SubOp; when STA_C2 => CPU_Next_State <= STA_C3; PC_Ctrl.Offset <= PC_NEXT; when STA_C3 => CPU_Next_State <= IPF_C2; Cache_Ctrl <= CACHE_PREFETCH; PC_Ctrl.Offset <= PC_NEXT; when STO_C1 => CPU_Next_State <= STO_C2; Cache_Ctrl <= CACHE_PREFETCH; DP_Ctrl.Src <= DATA_WR_REG; DP_Ctrl.Reg <= ACCUM; when STO_C2 => CPU_Next_State <= IPF_C1; PC_Ctrl.Offset <= PC_NEXT; if( Enable_Auto_Increment and SubOp(0) = '1' )then CPU_Next_State <= STO_C3; ALU_Ctrl.Oper <= ALU_UPP; ALU_Ctrl.Reg <= SubOp(2 downto 1) & '0'; end if; when STO_C3 => CPU_Next_State <= IPF_C2; PC_Ctrl.Offset <= PC_NEXT; ALU_Ctrl.Oper <= ALU_UPP2; ALU_Ctrl.Reg <= SubOp(2 downto 1) & '1'; when STX_C1 => CPU_Next_State <= IPF_C1; PC_Ctrl.Offset <= PC_NEXT; if( Enable_Auto_Increment and SubOp(0) = '1' )then CPU_Next_State <= STX_C2; ALU_Ctrl.Oper <= ALU_UPP; ALU_Ctrl.Reg <= SubOp(2 downto 1) & '0'; end if; when STX_C2 => CPU_Next_State <= IPF_C2; PC_Ctrl.Offset <= PC_NEXT; ALU_Ctrl.Oper <= ALU_UPP2; ALU_Ctrl.Reg <= SubOp(2 downto 1) & '1'; ------------------------------------------------------------------------------- -- Multi-Cycle Math Operations (UPP, MUL) ------------------------------------------------------------------------------- -- Because we have to backup the pipeline by 1 to refetch the 2nd -- instruction/first operand, we have to return through PF2. Also, we -- need to tell the ALU to store the results to R1:R0 here. Note that -- there is no ALU_Ctrl.Reg, as this is implied in the ALU instruction when MUL_C1 => CPU_Next_State <= IPF_C2; PC_Ctrl.Offset <= PC_NEXT; ALU_Ctrl.Oper <= ALU_MUL; when UPP_C1 => CPU_Next_State <= IPF_C2; PC_Ctrl.Offset <= PC_NEXT; ALU_Ctrl.Oper <= ALU_UPP2; ALU_Ctrl.Reg <= SubOp_p1; ------------------------------------------------------------------------------- -- Basic Stack Manipulation (PSH, POP, RSP) ------------------------------------------------------------------------------- when PSH_C1 => CPU_Next_State <= IPF_C1; SP_Ctrl.Oper <= SP_PUSH; when POP_C1 => CPU_Next_State <= POP_C2; when POP_C2 => CPU_Next_State <= POP_C3; PC_Ctrl.Offset <= PC_NEXT; when POP_C3 => CPU_Next_State <= POP_C4; Cache_Ctrl <= CACHE_OPER1; PC_Ctrl.Offset <= PC_NEXT; when POP_C4 => CPU_Next_State <= IDC_C0; Cache_Ctrl <= CACHE_INSTR; PC_Ctrl.Offset <= PC_NEXT; ALU_Ctrl.Oper <= ALU_POP; ALU_Ctrl.Reg <= SubOp; ------------------------------------------------------------------------------- -- Subroutines & Interrupts (RTS, JSR) ------------------------------------------------------------------------------- when WAI_Cx => -- For soft interrupts only, halt the Program_Ctr DP_Ctrl.Src <= DATA_BUS_IDLE; if( Int_Req = '1' )then CPU_Next_State <= ISR_C1; -- Rewind the PC by 3 to put the PC back to would have been the next -- instruction, compensating for the pipeline registers. PC_Ctrl.Offset <= PC_REV3; DP_Ctrl.Src <= DATA_RD_MEM; end if; when WAH_Cx => -- Holds until CPU_Halt_Req is deasserted. CPU_Halt_Ack <= '1'; DP_Ctrl.Src <= DATA_BUS_IDLE; if( CPU_Halt_Req = '0' )then CPU_Next_State <= IPF_C0; DP_Ctrl.Src <= DATA_RD_MEM; end if; when BRK_C1 => -- Debugging (BRK) Performs a 5-clock NOP. CPU_Next_State <= IPF_C0; when ISR_C1 => CPU_Next_State <= ISR_C2; INT_Ctrl.Incr_ISR <= '1'; when ISR_C2 => CPU_Next_State <= ISR_C3; DP_Ctrl.Src <= DATA_WR_FLAG; when ISR_C3 => CPU_Next_State <= JSR_C1; Cache_Ctrl <= CACHE_OPER1; ALU_Ctrl.Oper <= ALU_STP; ALU_Ctrl.Reg <= conv_std_logic_vector(PSR_I,3); SP_Ctrl.Oper <= SP_PUSH; DP_Ctrl.Src <= DATA_WR_PC_H; Ack_D <= '1'; when JSR_C1 => CPU_Next_State <= JSR_C2; Cache_Ctrl <= CACHE_OPER2; SP_Ctrl.Oper <= SP_PUSH; DP_Ctrl.Src <= DATA_WR_PC_L; when JSR_C2 => CPU_Next_State <= IPF_C0; PC_Ctrl.Oper <= PC_LOAD; SP_Ctrl.Oper <= SP_PUSH; when RTS_C1 => CPU_Next_State <= RTS_C2; SP_Ctrl.Oper <= SP_POP; when RTS_C2 => CPU_Next_State <= RTS_C3; -- if this is an RTI, then we need to POP the flags if( SubOp = SOP_RTI )then SP_Ctrl.Oper <= SP_POP; end if; when RTS_C3 => CPU_Next_State <= RTS_C4; Cache_Ctrl <= CACHE_OPER1; when RTS_C4 => CPU_Next_State <= RTS_C5; Cache_Ctrl <= CACHE_OPER2; when RTS_C5 => CPU_Next_State <= IPF_C0; PC_Ctrl.Oper <= PC_LOAD; -- if this is an RTI, then we need to clear the I bit if( SubOp = SOP_RTI )then CPU_Next_State <= RTI_C6; Cache_Ctrl <= CACHE_OPER1; ALU_Ctrl.Oper <= ALU_CLP; ALU_Ctrl.Reg <= conv_std_logic_vector(PSR_I,3); end if; when RTI_C6 => CPU_Next_State <= IPF_C1; PC_Ctrl.Offset <= PC_NEXT; ALU_Ctrl.Oper <= ALU_RFLG; when others => null; end case; end process; ------------------------------------------------------------------------------- -- Registered portion of CPU finite state machine ------------------------------------------------------------------------------- CPU_Regs: process( Reset, Clock ) variable Offset_SX : ADDRESS_TYPE; variable i_Ints : INTERRUPT_BUNDLE := x"00"; variable Index : integer range 0 to 7 := 0; variable Sum : std_logic_vector(8 downto 0) := "000000000"; variable Temp : std_logic_vector(8 downto 0) := "000000000"; begin if( Reset = Reset_Level )then CPU_State <= IPF_C0; CPU_Halt_Req <= '0'; Halt_Ack <= '0'; Opcode <= OP_INC; SubOp <= ACCUM; SubOp_p1 <= ACCUM; Operand1 <= x"00"; Operand2 <= x"00"; Instr_Prefetch <= '0'; Prefetch <= x"00"; Open8_Bus.Wr_En <= '0'; Open8_Bus.Wr_Data <= OPEN8_NULLBUS; Open8_Bus.Rd_En <= '1'; Program_Ctr <= Program_Start_Addr; Stack_Ptr <= Stack_Start_Addr; Ack_Q <= '0'; Ack_Q1 <= '0'; Int_Ack <= '0'; Int_Req <= '0'; Pending <= x"00"; Wait_for_FSM <= '0'; Wait_for_ISR <= '0'; Set_Mask <= '0'; if( Enable_NMI )then Int_Mask <= Default_Interrupt_Mask(7 downto 1) & '1'; else Int_Mask <= Default_Interrupt_Mask; end if; ISR_Addr_Offset <= INT_VECTOR_0; for i in 0 to 7 loop Regfile(i) <= x"00"; end loop; Flags <= x"00"; if( Supervisor_Mode )then Flags(PSR_I) <= '1'; end if; Open8_Bus.GP_Flags <= (others => '0'); elsif( rising_edge(Clock) )then CPU_State <= CPU_Next_State; -- Register the halt request and acknowledge lines CPU_Halt_Req <= Halt_Req; Halt_Ack <= CPU_Halt_Ack; ------------------------------------------------------------------------------- -- Instruction/Operand caching for pipelined memory access ------------------------------------------------------------------------------- -- To avoid putting too much load on the (usually massive) wire-OR'd bus, -- the CPU loads Rd_Data into one of four registers - instruction, -- operand 1 or 2, or the instruction prefetch registers. The first is -- used to decode an instruction when the prefetch isn't valid, while -- the two operand registers are used to hold any additional argument -- for multi-byte instructions. Because of the memory pipelining, some -- longer instructions can cache the next instruction as part of their -- execution in a prefetch register, allowing the CPU to skip loading -- it again later. Unfortunate, because instructions aren't all the same -- length, it is not feasible to cache their operands without adding a -- second partial decode stage that would obviate any savings. case Cache_Ctrl is when CACHE_INSTR => Opcode <= Rd_Data(7 downto 3); SubOp <= Rd_Data(2 downto 0); SubOp_p1 <= Rd_Data(2 downto 0) + 1; if( Instr_Prefetch = '1' )then Opcode <= Prefetch(7 downto 3); SubOp <= Prefetch(2 downto 0); SubOp_p1 <= Prefetch(2 downto 0) + 1; Instr_Prefetch <= '0'; end if; when CACHE_OPER1 => Operand1 <= Rd_Data; when CACHE_OPER2 => Operand2 <= Rd_Data; when CACHE_PREFETCH => Prefetch <= Rd_Data; Instr_Prefetch <= '1'; when CACHE_IDLE => null; end case; ------------------------------------------------------------------------------- -- Program Counter ------------------------------------------------------------------------------- -- The program counter is a bit unusual in that it always subtracts two -- from itself plus the signed offset. This is because of the way the -- assembler works when computing branches. Thus, to "IDLE" the counter, -- the offset is set to 2, while "NEXT" sets the offset to 3. Depending -- on how an instruction interacts with memory, or is pipelined, the -- offset can vary from -1 to 3 Offset_SX(15 downto 8) := (others => PC_Ctrl.Offset(7)); Offset_SX(7 downto 0) := PC_Ctrl.Offset; case PC_Ctrl.Oper is when PC_INCR => Program_Ctr <= Program_Ctr + Offset_SX - 2; when PC_LOAD => Program_Ctr <= Operand2 & Operand1; when others => null; end case; ------------------------------------------------------------------------------- -- (Write) Data Path ------------------------------------------------------------------------------- -- Note that this code handles both the Rd_En and Wr_En signals. These -- were separated to make downstream logic simpler (As opposed to the -- more classic RD_WRn and ADDR_STROBE scheme) It is also true to the -- original core, which also had separate read and write enable outputs Open8_Bus.Wr_En <= '0'; Open8_Bus.Wr_Data <= OPEN8_NULLBUS; Open8_Bus.Rd_En <= '0'; case DP_Ctrl.Src is when DATA_BUS_IDLE => null; when DATA_RD_MEM => Open8_Bus.Rd_En <= '1'; when DATA_WR_REG => Open8_Bus.Wr_En <= '1'; Open8_Bus.Wr_Data <= Regfile(conv_integer(DP_Ctrl.Reg)); when DATA_WR_FLAG => Open8_Bus.Wr_En <= '1'; Open8_Bus.Wr_Data <= Flags; when DATA_WR_PC_L => Open8_Bus.Wr_En <= '1'; Open8_Bus.Wr_Data <= Program_Ctr(7 downto 0); when DATA_WR_PC_H => Open8_Bus.Wr_En <= '1'; Open8_Bus.Wr_Data <= Program_Ctr(15 downto 8); when others => null; end case; ------------------------------------------------------------------------------- -- Stack Pointer ------------------------------------------------------------------------------- case SP_Ctrl.Oper is when SP_IDLE => null; when SP_CLR => Stack_Ptr <= Stack_Start_Addr; when SP_SET => if( Supervisor_Mode )then if( Flags(PSR_I) = '1' )then Stack_Ptr <= Regfile(1) & Regfile(0); end if; else Stack_Ptr <= Regfile(1) & Regfile(0); end if; when SP_POP => Stack_Ptr <= Stack_Ptr + 1; when SP_PUSH => Stack_Ptr <= Stack_Ptr - 1; when others => null; end case; ------------------------------------------------------------------------------- -- Interrupt Controller ------------------------------------------------------------------------------- -- If Supervisor_Mode is set, restrict the SMSK instruction such that it -- requires the I bit to be set. if( Supervisor_Mode )then Set_Mask <= INT_Ctrl.Mask_Set and Flags(PSR_I); else Set_Mask <= INT_Ctrl.Mask_Set; end if; -- The interrupt control mask is always sourced out of R0 if( Set_Mask = '1' )then if( Enable_NMI )then Int_Mask <= Regfile(conv_integer(ACCUM))(7 downto 1) & '1'; else Int_Mask <= Regfile(conv_integer(ACCUM)); end if; end if; -- Combine external and internal interrupts, and mask the OR of the two -- with the mask. Record any incoming interrupts to the pending buffer i_Ints := (Interrupts or INT_Ctrl.Soft_Ints) and Int_Mask; Pending <= i_Ints or Pending; -- If Sequential_Interrupts is set true, Wait_for_ISR should follow the -- I bit, preventing a new interrupt from starting until the I bit is -- cleared. if( Sequential_Interrupts )then Wait_for_ISR <= Flags(PSR_I); else Wait_for_ISR <= '0'; end if; if( Wait_for_FSM = '0' and Wait_for_ISR = '0' )then if( Pending(0) = '1' )then ISR_Addr_Offset <= INT_VECTOR_0; Pending(0) <= '0'; elsif( Pending(1) = '1' )then ISR_Addr_Offset <= INT_VECTOR_1; Pending(1) <= '0'; elsif( Pending(2) = '1' )then ISR_Addr_Offset <= INT_VECTOR_2; Pending(2) <= '0'; elsif( Pending(3) = '1' )then ISR_Addr_Offset <= INT_VECTOR_3; Pending(3) <= '0'; elsif( Pending(4) = '1' )then ISR_Addr_Offset <= INT_VECTOR_4; Pending(4) <= '0'; elsif( Pending(5) = '1' )then ISR_Addr_Offset <= INT_VECTOR_5; Pending(5) <= '0'; elsif( Pending(6) = '1' )then ISR_Addr_Offset <= INT_VECTOR_6; Pending(6) <= '0'; elsif( Pending(7) = '1' )then ISR_Addr_Offset <= INT_VECTOR_7; Pending(7) <= '0'; end if; Wait_for_FSM <= or_reduce(Pending); end if; -- Reset the Wait_for_FSM flag on Int_Ack Ack_Q <= Ack_D; Ack_Q1 <= Ack_Q; Int_Ack <= Ack_Q1; if( Int_Ack = '1' )then Wait_for_FSM <= '0'; end if; Int_Req <= Wait_for_FSM and (not Int_Ack); -- Incr_ISR allows the CPU Core to advance the vector address to pop the -- lower half of the address. if( INT_Ctrl.Incr_ISR = '1' )then ISR_Addr_Offset <= ISR_Addr_Offset + 1; end if; ------------------------------------------------------------------------------- -- ALU (Arithmetic / Logic Unit) ------------------------------------------------------------------------------- -- The ALU code is responsible for (and should be the only code altering) -- the register file. Most of the "instructions" directly map to opcodes -- but a few are for internal use only, such as operations involving the -- stack pointer or interrupt mask. Index := conv_integer(ALU_Ctrl.Reg); Sum := (others => '0'); Temp := (others => '0'); case ALU_Ctrl.Oper is when ALU_INC => -- Rn = Rn + 1 : Flags N,C,Z Sum := ("0" & x"01") + ("0" & Regfile(Index)); Flags(PSR_Z) <= nor_reduce(Sum(7 downto 0)); Flags(PSR_C) <= Sum(8); Flags(PSR_N) <= Sum(7); Regfile(Index) <= Sum(7 downto 0); when ALU_UPP => -- Rn = Rn + 1 Sum := ("0" & x"01") + ("0" & Regfile(Index)); Flags(PSR_C) <= Sum(8); Regfile(Index) <= Sum(7 downto 0); when ALU_UPP2 => -- Rn = Rn + C Sum := (x"00" & Flags(PSR_C)) + ("0" & Regfile(Index)); Flags(PSR_C) <= Sum(8); Regfile(Index) <= Sum(7 downto 0); when ALU_ADC => -- R0 = R0 + Rn + C : Flags N,C,Z Sum := ("0" & Regfile(0)) + ("0" & Regfile(Index)) + Flags(PSR_C); Flags(PSR_Z) <= nor_reduce(Sum(7 downto 0)); Flags(PSR_C) <= Sum(8); Flags(PSR_N) <= Sum(7); Regfile(0) <= Sum(7 downto 0); when ALU_TX0 => -- R0 = Rn : Flags N,Z Temp := "0" & Regfile(Index); Flags(PSR_Z) <= nor_reduce(Temp(7 downto 0)); Flags(PSR_N) <= Temp(7); Regfile(0) <= Temp(7 downto 0); when ALU_OR => -- R0 = R0 | Rn : Flags N,Z Temp(7 downto 0) := Regfile(0) or Regfile(Index); Flags(PSR_Z) <= nor_reduce(Temp(7 downto 0)); Flags(PSR_N) <= Temp(7); Regfile(0) <= Temp(7 downto 0); when ALU_AND => -- R0 = R0 & Rn : Flags N,Z Temp(7 downto 0) := Regfile(0) and Regfile(Index); Flags(PSR_Z) <= nor_reduce(Temp(7 downto 0)); Flags(PSR_N) <= Temp(7); Regfile(0) <= Temp(7 downto 0); when ALU_XOR => -- R0 = R0 ^ Rn : Flags N,Z Temp(7 downto 0) := Regfile(0) xor Regfile(Index); Flags(PSR_Z) <= nor_reduce(Temp(7 downto 0)); Flags(PSR_N) <= Temp(7); Regfile(0) <= Temp(7 downto 0); when ALU_ROL => -- Rn = Rn<<1,C : Flags N,C,Z Temp := Regfile(Index) & Flags(PSR_C); Flags(PSR_Z) <= nor_reduce(Temp(7 downto 0)); Flags(PSR_C) <= Temp(8); Flags(PSR_N) <= Temp(7); Regfile(Index) <= Temp(7 downto 0); when ALU_ROR => -- Rn = C,Rn>>1 : Flags N,C,Z Temp := Regfile(Index)(0) & Flags(PSR_C) & Regfile(Index)(7 downto 1); Flags(PSR_Z) <= nor_reduce(Temp(7 downto 0)); Flags(PSR_C) <= Temp(8); Flags(PSR_N) <= Temp(7); Regfile(Index) <= Temp(7 downto 0); when ALU_DEC => -- Rn = Rn - 1 : Flags N,C,Z Sum := ("0" & Regfile(Index)) + ("0" & x"FF"); Flags(PSR_Z) <= nor_reduce(Sum(7 downto 0)); Flags(PSR_C) <= Sum(8); Flags(PSR_N) <= Sum(7); Regfile(Index) <= Sum(7 downto 0); when ALU_SBC => -- Rn = R0 - Rn - C : Flags N,C,Z Sum := ("0" & Regfile(0)) + ("1" & (not Regfile(Index))) + Flags(PSR_C); Flags(PSR_Z) <= nor_reduce(Sum(7 downto 0)); Flags(PSR_C) <= Sum(8); Flags(PSR_N) <= Sum(7); Regfile(0) <= Sum(7 downto 0); when ALU_ADD => -- R0 = R0 + Rn : Flags N,C,Z Sum := ("0" & Regfile(0)) + ("0" & Regfile(Index)); Flags(PSR_C) <= Sum(8); Regfile(0) <= Sum(7 downto 0); Flags(PSR_Z) <= nor_reduce(Sum(7 downto 0)); Flags(PSR_N) <= Sum(7); when ALU_STP => -- Sets bit(n) in the Flags register Flags(Index) <= '1'; when ALU_BTT => -- Z = !R0(N), N = R0(7) Flags(PSR_Z) <= not Regfile(0)(Index); Flags(PSR_N) <= Regfile(0)(7); when ALU_CLP => -- Clears bit(n) in the Flags register Flags(Index) <= '0'; when ALU_T0X => -- Rn = R0 : Flags N,Z Temp := "0" & Regfile(0); Flags(PSR_Z) <= nor_reduce(Temp(7 downto 0)); Flags(PSR_N) <= Temp(7); Regfile(Index) <= Temp(7 downto 0); when ALU_CMP => -- Sets Flags on R0 - Rn : Flags N,C,Z Sum := ("0" & Regfile(0)) + ("1" & (not Regfile(Index))) + '1'; Flags(PSR_Z) <= nor_reduce(Sum(7 downto 0)); Flags(PSR_C) <= Sum(8); Flags(PSR_N) <= Sum(7); when ALU_MUL => -- Stage 1 of 2 {R1:R0} = R0 * Rn : Flags Z Regfile(0) <= Mult(7 downto 0); Regfile(1) <= Mult(15 downto 8); Flags(PSR_Z) <= nor_reduce(Mult); when ALU_LDI => -- Rn <= Data : Flags N,Z Flags(PSR_Z) <= nor_reduce(Operand1); Flags(PSR_N) <= Operand1(7); Regfile(Index) <= Operand1; when ALU_POP => -- Rn <= Data Regfile(Index) <= Operand1; when ALU_RFLG => Flags(3 downto 0) <= Operand1(3 downto 0); if( not RTI_Ignores_GP_Flags )then Flags(7 downto 4)<= Operand1(7 downto 4); end if; when ALU_RSP => Regfile(0) <= Stack_Ptr(7 downto 0); Regfile(1) <= Stack_Ptr(15 downto 8); when ALU_GMSK => Flags(PSR_Z) <= nor_reduce(Int_Mask); Regfile(0) <= Int_Mask; when others => null; end case; Open8_Bus.GP_Flags <= Flags(7 downto 3); end if; end process; ------------------------------------------------------------------------------- -- Multiplier Logic -- -- We need to infer a hardware multipler, so we create a special clocked -- process with no reset or clock enable ------------------------------------------------------------------------------- Multiplier_proc: process( Clock ) begin if( rising_edge(Clock) )then Mult <= Regfile(0) * Regfile(conv_integer(ALU_Ctrl.Reg)); end if; end process; end architecture;
Go to most recent revision | Compare with Previous | Blame | View Log