WB DDR3 SDRAM Controller
by robfinch on Jul 28, 2016 |
robfinch
Posts: 28 Joined: Sep 29, 2005 Last seen: Nov 18, 2024 |
||
I think the 32-bit access cycles are a bit limiting. Interfacing to the controller is likely to go through another layer in order to gain multiple ports to memory. Most accesses to the DRAM will be reads to fill cache lines or buffers. It would be better to have a much wider access (128 bits); it allows one to build the system at a slower clock rate than the high-speed memory clock. In a system I'm currently working on, four 128-bit reads (512 bits) are performed in a pipelined manner in order to get the necessary bandwidth.
Also, the output of the SDRAM controller may have to drive another memory port controller in order to support multiple ports to memory. Translating 128-bit data to smaller chunks may be handled by this controller. Does the controller use burst mode to access the DDR RAM? What is the burst length? |
RE: WB DDR3 SDRAM Controller
by dgisselq on Jul 29, 2016 |
dgisselq
Posts: 247 Joined: Feb 20, 2015 Last seen: Oct 24, 2024 |
||
Thank you for writing and sharing your thoughts. I think the core will support your approach nicely.
A couple of thoughts:
I've gotta run now, or I might write longer. Dan |
RE: WB DDR3 SDRAM Controller
by dgisselq on Jul 29, 2016 |
dgisselq
Posts: 247 Joined: Feb 20, 2015 Last seen: Oct 24, 2024 |
||
Gosh, did I say 128MHz? I meant 128 bits ...
|
RE: WB DDR3 SDRAM Controller
by robfinch on Jul 29, 2016 |
robfinch
Posts: 28 Joined: Sep 29, 2005 Last seen: Nov 18, 2024 |
||
:
Rob |
RE: WB DDR3 SDRAM Controller
by dgisselq on Jul 29, 2016 |
dgisselq
Posts: 247 Joined: Feb 20, 2015 Last seen: Oct 24, 2024 |
||
Rob,
I appreciate your inputs!
Is this a controller you would like to use for your projects as well (once completed and working ...)? I would certainly welcome any help with testing the core on other devices, if that's a help you would be able to provide! Just--give me about a week or so to get the core up and running first .... Thanks, Dan |
RE: WB DDR3 SDRAM Controller
by dgisselq on Jul 31, 2016 |
dgisselq
Posts: 247 Joined: Feb 20, 2015 Last seen: Oct 24, 2024 |
||
Rob,
There is one clock lost to getting parameters from the bus. This is due to the typical high fanout of the bus, and the difficulty of doing logic on bus inputs following that fanout. This makes sense if the controller is attached directly to the bus.
It doesn't make sense if the memory controller is being attached directly to a subcontroller, such as you mention above. Whether or not the bus is registered could easily become a parameter of the entire system. It might save 5ns or so--I'm not sure how tight your requirements are.
Of course, if you are like me, my speed requirement is: as fast as possible, and prove to me how fast you can get it. Under that justification, we should put such a parameter in place. :)
Dan |
RE: WB DDR3 SDRAM Controller
by robfinch on Jul 31, 2016 |
robfinch
Posts: 28 Joined: Sep 29, 2005 Last seen: Nov 18, 2024 |
||
I've been wondering if the DDR3 controller could be used to control a DDR2 SDRAM by adjusting the latencies appropriately. I've been trying to establish what the difference is between DDR3 and DDR2 and other than internals like prefetch buffer size, I've not found any. I don't have a DDR3 system to test on, but sometime in the future I may.
I would like to use the controller to further isolate systems from vendor dependence. |
RE: WB DDR3 SDRAM Controller
by robfinch on Jul 31, 2016 |
robfinch
Posts: 28 Joined: Sep 29, 2005 Last seen: Nov 18, 2024 |
||
I thought a diagram of how I currently use a DDR controller might help.
MPMC.png (59 kb)
|
RE: WB DDR3 SDRAM Controller
by dgisselq on Jul 31, 2016 |
dgisselq
Posts: 247 Joined: Feb 20, 2015 Last seen: Oct 24, 2024 |
||
I don't know the differences between DDR2 and DDR3, other than there is already a DDR2 controller on this site. I don't know how well or poorly it works; I haven't tried it. As for your application, that brings up some interesting strategy questions within the controller. Consider:
I am tuning the controller that I am building for an application that would like to read items sequentially through memory. That means that when I activate (open) a row of a bank, I am also likely to activate the next bank over as well, closing both first if necessary. Further, I am not planning to close any banks until either 1) the mandatory refresh period times out and I am forced to close all banks, 2) the bank is open to the wrong row, or 3) the bank prior is being opened and closing this one will help prepare it for continuous reading.
The problem is, if you have a group of 8+ actors trying to deal with memory, the sequential access assumption may not make the most sense. It may make more sense to optimize access to the bank around completely random accesses. In that case, you would want to close any bank row that isn't being used, to save the 15ns cost of closing once you discover you need it activated for a different row.
I'm not going to swear that I have the optimal strategy. It's just the one I'm building right now. I will welcome thoughts others might have as to what strategy makes the most sense. Just some thoughts to consider--indeed, thoughts that you don't get to consider with a proprietary memory controller.
Dan |
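To make the trade-off above concrete, here is a toy Python model that compares keeping rows open against closing the row after every access. It is only a sketch: the cycle costs are made-up placeholders, the function name `access_cost` is hypothetical, the "close after access" policy assumes the precharge is hidden behind other traffic, and it does not model the look-ahead into the next bank described in the post.

```python
# Hypothetical cycle costs, for illustration only (not from any datasheet).
T_ACTIVATE = 2   # open a row in an idle (precharged) bank
T_PRECHARGE = 2  # close a row that is currently open
T_ACCESS = 1     # read or write a row that is already open


def access_cost(requests, close_after_access):
    """Return total cycles for a list of (bank, row) requests.

    close_after_access=False keeps each row open until a conflict.
    close_after_access=True closes the row right after each access and
    assumes (optimistically) that this precharge costs nothing extra.
    """
    open_rows = {}  # bank -> row currently open in that bank
    cycles = 0
    for bank, row in requests:
        if open_rows.get(bank) == row:
            cycles += T_ACCESS                              # row hit
        elif bank in open_rows:
            cycles += T_PRECHARGE + T_ACTIVATE + T_ACCESS   # wrong row open
        else:
            cycles += T_ACTIVATE + T_ACCESS                 # bank was idle
        if close_after_access:
            open_rows.pop(bank, None)
        else:
            open_rows[bank] = row
    return cycles


sequential = [(0, 5)] * 8                   # eight accesses to one open row
conflicting = [(0, r) for r in range(8)]    # a new row in the same bank each time

for name, reqs in (("sequential", sequential), ("conflicting", conflicting)):
    print(name, access_cost(reqs, False), access_cost(reqs, True))
```

With these placeholder costs, keeping rows open wins for the sequential pattern and closing early wins for the conflicting pattern, which is the tension described above. Real numbers depend on the part's timing parameters and on how well the precharge can actually be hidden.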
RE: WB DDR3 SDRAM Controller
by robfinch on Aug 2, 2016 |
robfinch
Posts: 28 Joined: Sep 29, 2005 Last seen: Nov 18, 2024 |
||
Although there's 8+ actors in the system, I think I can arrange the one with the greatest sequential access requirements (the bitmap display) to stay within a single bank (or two if page flipping). And arrange the system so other things are not accessing the bank.
Just looking at the SDRAM controller as it is now, it looks like one must write all 128 bits for a write transaction. There doesn't seem to be an i_wb_sel line for byte write enables (or even 32-bit word enables). This would mean that data updates have to be read-modify-write cycles. |
RE: WB DDR3 SDRAM Controller
by dgisselq on Aug 2, 2016 |
dgisselq
Posts: 247 Joined: Feb 20, 2015 Last seen: Oct 24, 2024 |
||
Yeah, I tend not to use the sel lines and do everything at 32-bits. I intend to discuss this, and my reason(s) for it, at the upcoming ORCONF.
That said, the _sel lines would be extremely easy to implement in this case and even with this controller. They would come in and get treated like data up until the point that the data lines get placed on the bus. At that point, the lines would be placed onto the udm and ldm lines. IIRC, there's a direct translation, so again putting _sel lines in would not be too difficult. Even better, from my standpoint as someone who doesn't use them, I think Xilinx would be smart enough to optimize them out. As a result, it wouldn't hurt if they were there. This would give you 8-bit write resolution into the memory.
As far as 32-bit word enables, they are actually accomplished within the memory controller. Remember, I've been saying that 32-bits is the natural word size of this controller. It really is. Even if you want to do 128-bit transactions, you're going to find yourself doing a series of 32-bit transactions.
To do 32-bit transactions with this controller, and to leave out one or two values from a 128-bit word, just write to the controller all the words you want to write to the memory: one word per clock, as in the pipelined wishbone spec. If you want to skip a 32-bit word--skip it. The logic within the controller will detect that and mask out the missing write. No further work necessary. Sure, you could disagree with me and rewrite the controller to remove this logic to simplify it for 128-bit writes but ... well, that's the wonderful thing about GPL.
If you go the road I just described, do be aware that when writing a 128-bit word, if you don't have a word to write on every clock you may find yourself using multiple 128-bit transactions, which would take nearly twice as long. Hence if you wish to write w[0], w[2], w[3], where w[0:3] make up a 128-bit word, write each of w[0], w[1], and w[3] on separate consecutive clocks. Don't stall on the w[2] clock and pick it up again with w[3]--write w[3] on the third clock. The controller will then write to the memory, and mask out the writing of w[2].
(Actually, if this is a "new" transaction, whose bank is not "activated", the controller is likely to stall for a cycle or two after you've written w[0] and w[1]. You might still get w[2] in before the controller notices it's missing. This wouldn't work, though, for a missing w[1] or w[3].)
Dan |
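The two masking ideas in this post can be sketched in a few lines of Python. This is an illustrative model, not code from the controller: the function names `burst_word_mask` and `sel_to_dm` are hypothetical, and the second function assumes the usual polarities, where a Wishbone select bit is high for a byte to be written and a DDR3 data-mask (DM) bit is high for a byte to be left untouched.

```python
def burst_word_mask(words_written, words_per_burst=4):
    """Per-word mask for one 128-bit burst of 32-bit words.

    words_written is the set of word indices the master actually sent.
    A 1 in the result means "do not write this word".
    """
    return [0 if i in words_written else 1 for i in range(words_per_burst)]


def sel_to_dm(sel, width=4):
    """Turn active-high byte selects into active-high data-mask bits.

    Assumes DM high suppresses the write of that byte, so the mask is
    simply the bitwise inverse of the select lines.
    """
    return ~sel & ((1 << width) - 1)


# Writing w[0], w[1] and w[3] on consecutive clocks leaves w[2] masked out.
print(burst_word_mask({0, 1, 3}))

# Selecting only bytes 0 and 2 of a 32-bit word masks bytes 1 and 3.
print(format(sel_to_dm(0b0101), "04b"))
```

On a x16 part each 16-bit beat carries two mask bits (the udm and ldm lines mentioned above), so the four mask bits of a 32-bit word would be spread over two beats.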
RE: WB DDR3 SDRAM Controller
by dgisselq on Aug 2, 2016 |
dgisselq
Posts: 247 Joined: Feb 20, 2015 Last seen: Oct 24, 2024 |
||
Sigh, you know I do profread my answers before sending them ...
The example should read if you wish to write w[0], w[1], and w[3]--not w[0], w[2] and w[3]--otherwise the example isn't consistent.
Sorry.
Dan |
RE: WB DDR3 SDRAM Controller
by olof on Aug 2, 2016 |
olof
Posts: 218 Joined: Feb 10, 2010 Last seen: Dec 17, 2018 |
||
Hi guys,
I've been meaning to enter the discussion sooner, but haven't had time to do that. I have done some work on this problem a few years ago, but was never able to finish it. I actually revisited the code quite recently with the ambition to push something.
I started out from the excellent wb_sdram_ctrl core (https://github.com/skristiansson/wb_sdram_ctrl/) that we have been using a lot for SDRAM interfaces for many OpenRISC-based SoCs. This provides an SDRAM interface in one end and multiple wishbone slave ports in the other. The wishbone ports all contain a small cache (which is coherent between the ports). The reason for the cache is to always do full bursts to the RAM in order to mitigate some of the latency that we would have from sequential single word accesses.
The work I was doing for that IP was to split it up into three different cores.
1. An SDRAM phy with a DFI interface. DFI (http://ddr-phy.org/) is a standard interface between a controller and a phy to separate the technology-specific phy and the (possibly) technology-independent controller. This would allow us to just switch out the phy depending on the FPGA and SDRAM/DDR* chips we want to interface.
2. A controller that can be configured for SDRAM/DDR2/DDR3 operations. Much of the controller is the same for all technologies. The differences are mostly the init state machine and some commands.
3. A multiport cached wishbone arbiter. This could be used as a system cache. Likely towards a memory controller, but in theory, anything could be hooked up to this.
I'd love to finish this work myself as I've come a pretty long way, but I don't really have the time right now. I'm happy though to share the code I got right now, and discuss this in more detail. For more in-depth discussions, I'm usually available on the #openrisc channel on irc.freenode.net
//Olof |
RE: WB DDR3 SDRAM Controller
by dgisselq on Aug 3, 2016 |
dgisselq
Posts: 247 Joined: Feb 20, 2015 Last seen: Oct 24, 2024 |
||
Rob,
I noticed that Xilinx orders their memory addresses as BANK:ROW:COLUMN. If this is how you were judging whether your peripherals might stay within a given bank, I should warn you that this is not how I am ordering memory. I am ordering memory as in ROW:BANK:COLUMN.
The reason for this is simply pipeline performance. Before I get to the end of a column, I can close and open the next bank. This allows me to do continuous reading/writing without stalling. Had I instead used BANK:ROW:COLUMN, then I would need to stall at the end of the COLUMN to close the row within the bank, and to open a new row within the same bank. This is inefficient speed-wise, and so I have chosen the other approach.
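The difference between the two orderings is easy to see in a small Python sketch. The field widths below (10 column bits, 3 bank bits, 14 row bits) and the function names are assumptions chosen for illustration; they are not taken from the controller or from any particular part.

```python
# Assumed geometry, for illustration only.
COL_BITS, BANK_BITS, ROW_BITS = 10, 3, 14


def split_row_bank_col(addr):
    """Decode addr as ROW:BANK:COLUMN (row in the most significant bits)."""
    col = addr & ((1 << COL_BITS) - 1)
    bank = (addr >> COL_BITS) & ((1 << BANK_BITS) - 1)
    row = (addr >> (COL_BITS + BANK_BITS)) & ((1 << ROW_BITS) - 1)
    return row, bank, col


def split_bank_row_col(addr):
    """Decode addr as BANK:ROW:COLUMN (bank in the most significant bits)."""
    col = addr & ((1 << COL_BITS) - 1)
    row = (addr >> COL_BITS) & ((1 << ROW_BITS) - 1)
    bank = (addr >> (COL_BITS + ROW_BITS)) & ((1 << BANK_BITS) - 1)
    return row, bank, col


last_col = (1 << COL_BITS) - 1   # last address before the column field wraps

# Each result is a (row, bank, col) tuple.
print(split_row_bank_col(last_col), split_row_bank_col(last_col + 1))
print(split_bank_row_col(last_col), split_bank_row_col(last_col + 1))
```

Under ROW:BANK:COLUMN, stepping past the last column lands in the next bank, which can be opened ahead of time. Under BANK:ROW:COLUMN the same step lands in a new row of the same bank, so that bank has to be closed and reopened first.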
Olof,
Your description sounds very familiar to what Rob is interested in. Thanks for pointing it out!
Dan |
RE: WB DDR3 SDRAM Controller
by robfinch on Aug 3, 2016 |
robfinch
Posts: 28 Joined: Sep 29, 2005 Last seen: Nov 18, 2024 |
||
I should warn you that this is not how I am ordering memory. I am ordering memory as in ROW:BANK:COLUMN.
Thanks Dan, I was watching out for that. I will likely end up using a modified version of the controller. I'd like to at least add select lines to it.
Olof,
I started out from the excellent wb_sdram_ctrl core (https://github.com/skristiansson/wb_sdram_ctrl/) that we have been using a lot ...
1. A SDRAM phy with a DFI interface. DFI (http://ddr-phy.org/) ... 2. A controller that can be configured for SDRAM/DDR2/DDR3 operations. ...
This generally sounds like what I'm after.
3. A multiport cached wishbone arbiter. This could be used as a system cache. ...
Is this a single cache with multiple ports? I've been using separate read-nano-caches for each port. In a version of the controller there are several parallel caches for the cpu port in order to support different threads. Thread-id determines cache used. Writes go directly through to the SDRAM controller. (I believe the vendor's DRAM controller queues writes). I'm also using slightly different port configurations based on the port number/purpose. The multi-port controller is non-generic in nature at the moment. |