URL https://opencores.org/ocsvn/ion/ion/trunk

[/] [ion/] [trunk/] [doc/] [src/] [tex/] [cache.tex] - Diff between revs 210 and 221

Rev 210 Rev 221
Line 17... Line 17...
    Initialization means mostly marking all D- and I-cache lines as invalid.
    Initialization means mostly marking all D- and I-cache lines as invalid.
    The old R3000 had its own means to achieve this, but this core implements an
    The old R3000 had its own means to achieve this, but this core implements an
    alternative, simplified scheme.\\
    alternative, simplified scheme.\\
 
 
    The standard R3000 cache control flags in the SR are not used, either. Instead,
    The standard R3000 cache control flags in the SR are not used, either. Instead,
    two flags from the SR have been commandeered for cache control.\\
    two flags from the SR have been repurposed for cache control.\\
 
 
\subsection{Cache control flags}
\subsection{Cache control flags}
\label{cache_control_flags}
\label{cache_control_flags}
 
 
    Bits 17 and 16 of the SR are NOT used for their standard R3000 purpose.
    Bits 17 and 16 of the SR are NOT used for their standard R3000 purpose.
Line 65... Line 65...
    entire 32-bit address:
    entire 32-bit address:
 
 
\needspace{10\baselineskip}
\needspace{10\baselineskip}
\begin{verbatim}
\begin{verbatim}
 
 
             ___________ <-- These address bits are NOT in the tag
                _________ <-- These address bits are NOT in the tag
            /           \
            /           \
    31 ..   27| 26 .. 21  |20 ..          12|11  ..        4|3:2|
    31 ..   27| 26 .. 21  |20 ..          12|11  ..        4|3:2|
    +---------+-----------+-----------------+---------------+---+---+
    +---------+-----------+-----------------+---------------+---+---+
    | 5       |           | 9               | 8             | 2 |   |
    | 5       |           | 9               | 8             | 2 |   |
    +---------+-----------+-----------------+---------------+---+---+
    +---------+-----------+-----------------+---------------+---+---+
Line 79... Line 79...
    \end{verbatim}\\
    \end{verbatim}\\
 
 
    Since bits 26 downto 21 are not included in the tag, there will be a
    Since bits 26 downto 21 are not included in the tag, there will be a
    'mirror' effect in the cache. We have effectively split the memory space
    'mirror' effect in the cache. We have effectively split the memory space
    into 32 separate blocks of 1MB which is obviously not enough but will do
    into 32 separate blocks of 1MB which is obviously not enough but will do
    for the initial tests.
    for the initial versions of the core.
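
    The mirror effect described above can be illustrated with a short Python
    model. This is only a sketch and not part of the core: the field boundaries
    are copied from the address diagram, while the function names and the
    assumption that the 5-bit and 9-bit fields together form the tag (with the
    8-bit field selecting the cache line) are illustrative.

```python
# Bits 26..21 are the ones the diagram marks as NOT being part of the tag.
IGNORED_MASK = 0x3F << 21


def split_address(addr):
    """Split a 32-bit address into the fields shown in the address diagram."""
    return {
        "tag_hi": (addr >> 27) & 0x1F,    # bits 31..27, 5 bits
        "ignored": (addr >> 21) & 0x3F,   # bits 26..21, not stored anywhere
        "tag_lo": (addr >> 12) & 0x1FF,   # bits 20..12, 9 bits
        "line": (addr >> 4) & 0xFF,       # bits 11..4, 8 bits (assumed line index)
        "word": (addr >> 2) & 0x3,        # bits 3..2, 2 bits (assumed word in line)
    }


def cache_key(addr):
    """Return (line, tag) as the cache would see them under the assumptions above."""
    f = split_address(addr)
    tag = (f["tag_hi"] << 9) | f["tag_lo"]
    return f["line"], tag


def is_mirror(a, b):
    """True if a and b are different addresses that differ only in bits 26..21."""
    diff = (a ^ b) & 0xFFFFFFFF
    return a != b and (diff & ~IGNORED_MASK) == 0


if __name__ == "__main__":
    a = 0x00000430
    b = a | (1 << 21)   # differs only in an ignored bit
    print(cache_key(a) == cache_key(b), is_mirror(a, b))
```

    Two addresses that differ only in bits 26 downto 21 yield the same line
    and tag in this model, so the cache cannot tell them apart.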
 
 
    In subsequent versions of the cache, the tag size needs to be enlarged AND
    In subsequent versions of the cache, the tag size needs to be enlarged AND
    some of the top bits might be omitted when they're not needed to implement
    some of the top bits might be omitted when they're not needed to implement
    the default memory map (namely bit 30 which is always '0').
    the default MIPS memory map (namely bit 30 which is always '0').
 
 
 
 
\section{Memory Controller}
\section{Memory Controller}
\label{memory_controller}
\label{memory_controller}
 
 
Line 120... Line 121...
    decode logic.\\
    decode logic.\\
 
 
    For each address, the memory map logic will supply the following information:
    For each address, the memory map logic will supply the following information:
 
 
\begin{enumerate}
\begin{enumerate}
    \item What kind of memory it is
    \item What kind of memory it is.
    \item How many wait states to use
    \item How many wait states to use.
    \item Whether it is writeable or not (ignored in current version)
    \item Whether it is writeable or not (ignored in current version).
    \item Whether it is cacheable or not (ignored in current version)
    \item Whether it is cacheable or not (ignored in current version).
\end{enumerate}
\end{enumerate}
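
    The four items above can be pictured as the result of a table lookup. The
    following Python sketch only illustrates that idea and does not describe
    the actual decode logic: the address ranges, memory type names and wait
    state counts are all hypothetical placeholders.

```python
from collections import namedtuple

# One record per decoded address range; the fields mirror the four items above.
MemArea = namedtuple(
    "MemArea", "base size kind wait_states writeable cacheable"
)

# Hypothetical map: every value below is a placeholder chosen for illustration.
MEMORY_MAP = (
    MemArea(0x00000000, 0x00010000, "sram_16bit", 1, True, True),
    MemArea(0xB0000000, 0x00040000, "static_8bit", 7, False, False),
)


def decode(addr):
    """Return the MemArea that contains addr, or None if it is unmapped."""
    for area in MEMORY_MAP:
        if area.base <= addr < area.base + area.size:
            return area
    return None


if __name__ == "__main__":
    print(decode(0x00000430))
```

    A fixed table like this one has no way to be changed while running; it is
    written that way here purely to keep the sketch short.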
 
 
    In the present implementation the memory map can't be modified at run time.\\
    In the present implementation the memory map can't be modified at run time.\\
 
 
    These are the currently supported memory types:
    These are the currently supported memory types:
Line 207... Line 208...
              __    __    __    __    __    __    _     __    __    __    __
              __    __    __    __    __    __    _     __    __    __    __
clk         _/  \__/  \__/  \__/  \__/  \__/  \__/ ..._/  \__/  \__/  \__/
clk         _/  \__/  \__/  \__/  \__/  \__/  \__/ ..._/  \__/  \__/  \__/
 
 
cache/ps    ?| (1)             | (2)             | ... | (2)             |??
cache/ps    ?| (1)             | (2)             | ... | (2)             |??
 
 
refill_ctr  ?| 0                                 | ... <  3              |??
refill_ctr  ?| 0                                 | ... |  3              |??
 
 
chip_addr   ?|  210h           |  211h           | ... |  217h           |--
chip_addr   ?|  210h           |  211h           | ... |  217h           |--
 
 
data_rd     -XXXXX  [218h]     XXXXX  [219h]     | ... XXXXX  [217h]     |--
data_rd     -XXXXX  [218h]     XXXXX  [219h]     | ... XXXXX  [217h]     |--
             |<- 2-state sequence              ->|
             |<- 2-state sequence              ->|
Line 227... Line 228...
 
 
Signal \emph{cache/ps} is the current state of the cache state machine, and
Signal \emph{cache/ps} is the current state of the cache state machine, and
in this chronogram it takes the following values:
in this chronogram it takes the following values:
 
 
\begin{enumerate}
\begin{enumerate}
\item idle
 
\item data\_refill\_sram\_0
\item data\_refill\_sram\_0
\item data\_refill\_sram\_1
\item data\_refill\_sram\_1
\end{enumerate}
\end{enumerate}
 
 
Each of the two states reads a halfword from SRAM. The two-state sequence is
Each of the two states reads a halfword from SRAM. The two-state sequence is
Line 242... Line 242...
 
 
 
 
\subsubsection{SRAM interface read cycle timing -- 8-bit interface}
\subsubsection{SRAM interface read cycle timing -- 8-bit interface}
\label{sram_read_cycle_8b}
\label{sram_read_cycle_8b}
 
 
TODO: 8-bit refill procedure to be done.
The refill from an 8-bit static memory is essentially the same as depicted
 
above, except we need to read 4 bytes (over the LSB lines of the static memory
 
data bus) instead of 2 16-bit halfwords. The operation takes correspondingly
 
longer to perform and uses an extra address line but is otherwise identical.
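
The difference between the two refill widths can be sketched in Python as
follows. This is an illustrative model only: the function names are made up,
and the assumption that the most significant part of the word is read first
is not confirmed by the text above.

```python
def chip_addresses(word_index, bus_width):
    """Chip addresses needed to fetch one 32-bit word over an 8- or 16-bit bus."""
    if bus_width not in (8, 16):
        raise ValueError("bus_width must be 8 or 16")
    reads = 32 // bus_width           # 2 halfword reads or 4 byte reads
    base = word_index * reads         # the 8-bit case needs one more address bit
    return [base + i for i in range(reads)]


def assemble_word(chunks, bus_width):
    """Join the chunks read from memory into a 32-bit word.

    ASSUMPTION: the first chunk read is the most significant one.
    """
    mask = (1 << bus_width) - 1
    word = 0
    for chunk in chunks:
        word = (word << bus_width) | (chunk & mask)
    return word


if __name__ == "__main__":
    print([hex(a) for a in chip_addresses(0x108, 16)])
    print(hex(assemble_word([0x12, 0x34, 0x56, 0x78], 8)))
```

With a 16-bit bus the model issues two reads per word, and with an 8-bit bus
it issues four, which is why the 8-bit refill takes correspondingly longer.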
 
 
 
TODO: 8-bit refill chronogram to be done.
 
 
 
 
\subsubsection{SRAM interface write cycle timing}
\subsubsection{16-bit SRAM interface write cycle timing}
\label{sram_write_cycle}
\label{sram_write_cycle}
 
 
The path of the state machine that deals with SRAM writethroughs is linear so
The path of the state machine that deals with SRAM writethroughs is linear so
a state diagram would not be very interesting. As you can see in the source
a state diagram would not be very interesting. As you can see in the source
code, all the states are one clock long except for states
code, all the states are one clock long except for states
Line 259... Line 264...
attribute.\\
attribute.\\
 
 
A general memory write will be 32-bit wide and thus it will take two 16-bit
A general memory write will be 32-bit wide and thus it will take two 16-bit
memory accesses to complete. Unaligned, halfword or byte wide CPU writes might
memory accesses to complete. Unaligned, halfword or byte wide CPU writes might
in some cases be optimized to take only a single 16-bit memory access. This
in some cases be optimized to take only a single 16-bit memory access. This
module does no such optimization.
module does no such optimization yet.
For simplicity, all writethroughs take two 16-bit access cycles, even if one
For simplicity, all writethroughs take two 16-bit access cycles, even if one
of them has both we\_n signals deasserted.\\
of them has both we\_n signals deasserted.\\
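
The behavior described above (always two 16-bit accesses, with the we\_n
signals selecting which bytes are actually written) can be modeled with a
short Python sketch. The function name, the record layout and the assumption
that the upper halfword is written first are illustrative and not taken from
the source code.

```python
def writethrough_accesses(word, byte_enables):
    """Split a CPU write into the two 16-bit accesses of a writethrough.

    word         -- 32-bit value being written
    byte_enables -- four booleans, most significant byte first (assumed order)
    """
    if len(byte_enables) != 4:
        raise ValueError("expected four byte enables")
    # we_n is active low: 0 writes the byte lane, 1 leaves it untouched.
    we_n = [0 if enabled else 1 for enabled in byte_enables]
    high = (word >> 16) & 0xFFFF
    low = word & 0xFFFF
    # Two accesses are always produced, even if one of them writes nothing.
    return [
        {"halfword": 0, "data": high, "we_n": (we_n[0], we_n[1])},
        {"halfword": 1, "data": low, "we_n": (we_n[2], we_n[3])},
    ]


if __name__ == "__main__":
    for access in writethrough_accesses(0x12345678, [True] * 4):
        print(access)
```

For a byte-wide write such as \texttt{[False, False, False, True]}, the first
access in this model carries both we\_n signals deasserted, which is the
unoptimized case mentioned above.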
 
 
The following chronogram has been copied from a simulation of the 'hello'
The following chronogram has been copied from a simulation of the 'hello'
sample. It's a 32-bit wide write to address 00000430h.
sample. It's a 32-bit wide write to address 00000430h.
Line 272... Line 277...
word). In this particular case, all the four bytes of the long word are written
word). In this particular case, all the four bytes of the long word are written
and so both the we\_n signals are asserted for both halfwords.
and so both the we\_n signals are asserted for both halfwords.
 
 
In this example, the SRAM is being accessed with 1 WS: WE\_N is asserted for
In this example, the SRAM is being accessed with 1 WS: WE\_N is asserted for
two cycles.
two cycles.
Note how a lot of cycles are lost in order to guarantee compliance with the
Note how a lot of cycles are used in order to guarantee compliance with the
setup and hold times of the SRAM against the we, address and data lines.
setup and hold times of the SRAM against the we, address and data lines.
 
 
\needspace{15\baselineskip}
\needspace{15\baselineskip}
\begin{verbatim}
\begin{verbatim}
==== Chronogram 4.3: 16-bit SRAM writethrough, 32-bit wide =================
==== Chronogram 4.3: 16-bit SRAM writethrough, 32-bit wide =================
Line 300... Line 305...
 
 
Signal \emph{cache/ps} is the current state of the cache state machine, and
Signal \emph{cache/ps} is the current state of the cache state machine, and
in this chronogram it takes the following values:
in this chronogram it takes the following values:
 
 
\begin{enumerate}
\begin{enumerate}
\item idle
 
\item data\_writethrough\_sram\_0a
\item data\_writethrough\_sram\_0a
\item data\_writethrough\_sram\_0b
\item data\_writethrough\_sram\_0b
\item data\_writethrough\_sram\_0c
\item data\_writethrough\_sram\_0c
\item data\_writethrough\_sram\_1a
\item data\_writethrough\_sram\_1a
\item data\_writethrough\_sram\_1b
\item data\_writethrough\_sram\_1b
Line 314... Line 318...
 
 
 
 
\section{Known Problems}
\section{Known Problems}
\label{cache_problems}
\label{cache_problems}
 
 
 
    The cache implementation is still provisional and has a number of
 
    acknowledged problems:
 
 
\begin{enumerate}
\begin{enumerate}
\item All parameters hardcoded -- generics are almost ignored.
\item All parameters hardcoded -- generics are almost ignored.
\item SRAM read state machine does not guarantee internal FPGA $T_{hold}$.
\item SRAM read state machine does not guarantee internal FPGA $T_{hold}$.
        In my current target board it works because the FPGA hold times
        In my current target board it works because the FPGA hold times
        (including an input mux
        (including an input mux
        in the parent module) are far smaller than the SRAM response times, but
        in the parent module) are far smaller than the SRAM response times, but
        it would be better to insert an extra cycle after the wait states in
        it would be better to insert an extra cycle after the wait states in
        the sram read state machine.
        the sram read state machine.
 
\item Cache logic mixed with memory controller logic.
\end{enumerate}
\end{enumerate}
 
 
 
 
 No newline at end of file
 No newline at end of file
