OpenCores

* Ethernet MAC 10/100 Mbps

MAC silently corrupts rx pkts #6
Open TristanSchmelcher opened this issue over 17 years ago
TristanSchmelcher commented over 17 years ago

Hello,

I am using the OpenCores MAC in a commercial embedded system, and I have discovered that it very occasionally corrupts received packets silently. That is, the data that it writes to memory is corrupted, but it does not report a CRC error. The corruption seems to be very specific: a 12-byte sequence that starts on at least a 4-byte boundary is shifted forward by 4 bytes (thus duplicating the first 4 bytes of the sequence and overwriting the 4 bytes that should have come after it). I can easily reproduce the problem with a simple flood ping to my device. This is a particularly insidious bug, as it leaves header information intact and can thus cause undetected errors in higher-level protocols (e.g., HTTP transfers, which is how I discovered this).
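For anyone comparing captured frames against what was actually sent, the pattern described above can be modeled roughly as follows. This is only a sketch of one reading of the description: the function names are hypothetical, and the exact byte layout (the 4 bytes at the aligned offset stay in place, and the 12-byte run reappears 4 bytes later) is an assumption, not something confirmed against the hardware.

```python
from typing import Optional


def simulate_shift_corruption(frame: bytes, offset: int) -> bytes:
    """Return a copy of frame with the described corruption applied at offset.

    Assumed layout: the 12 bytes starting at the 4-byte-aligned offset are
    written 4 bytes too late. The first 4 bytes therefore appear twice and
    the 4 bytes that should follow the run are overwritten. Length is kept.
    """
    if offset % 4 != 0:
        raise ValueError("offset must be 4-byte aligned")
    if offset < 0 or offset + 16 > len(frame):
        raise ValueError("need 16 bytes of frame data starting at offset")
    return frame[:offset + 4] + frame[offset:offset + 12] + frame[offset + 16:]


def find_shift_corruption(expected: bytes, received: bytes) -> Optional[int]:
    """Return the aligned offset where received matches the pattern, else None."""
    if len(expected) != len(received) or expected == received:
        return None
    for offset in range(0, len(expected) - 15, 4):
        if simulate_shift_corruption(expected, offset) == received:
            return offset
    return None
```

With a known payload (a flood ping, for example, sends a predictable pattern), `find_shift_corruption(sent, received)` reports where a received frame deviates in exactly this way, which helps separate this bug from other kinds of corruption.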

I have added software CRC checking as a workaround, so this is no longer a problem for me (the packet loss is not a big issue for my application), but I am posting here so that others will know about the problem (and perhaps someone will investigate?).
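A software check of this kind can be sketched as below. It is written in Python for illustration only; a real driver would do the equivalent in its receive path. It assumes the MAC leaves the 4-byte frame check sequence (FCS) at the end of the receive buffer, which is an assumption about the configuration rather than something stated in this thread. The Ethernet FCS uses the same CRC-32 that `zlib.crc32` computes, stored least significant byte first. The helper names are hypothetical.

```python
import zlib


def append_fcs(frame_without_fcs: bytes) -> bytes:
    """Append the Ethernet FCS (CRC-32, least significant byte first)."""
    fcs = zlib.crc32(frame_without_fcs) & 0xFFFFFFFF
    return frame_without_fcs + fcs.to_bytes(4, "little")


def fcs_ok(frame_with_fcs: bytes) -> bool:
    """Recompute the CRC-32 over the frame and compare it with the trailing FCS.

    Assumes the buffer holds the frame from the destination address through
    the FCS, with no preamble and no extra padding after the FCS.
    """
    if len(frame_with_fcs) < 4:
        return False
    body, fcs = frame_with_fcs[:-4], frame_with_fcs[-4:]
    return (zlib.crc32(body) & 0xFFFFFFFF) == int.from_bytes(fcs, "little")
```

Frames for which `fcs_ok` returns false are dropped, so the corruption shows up as ordinary packet loss that higher-level protocols can recover from, rather than as bad data passed upward.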

Note that the bug at http://www.opencores.org/ptracker.cgi/view/ethmac/283 seems similar, but I have tried the fix suggested there by Jun and it does not help.

My system is a Nios II CPU running in an EP2C35F484C7 FPGA. (I'm using the Avalon version of the core from MaCo.) The OS is uClinux 2.6.17-uc1.

ocghost commented over 17 years ago

Are you using Altera's SDRAM controller (SDR, not DDR)? If so, I may have a fix for that. I had a similar problem where received packets were corrupted by duplicated (4-byte) data in the packet. The hardware CRC could not catch it because it happens when the received packet is written into memory. Basically, the problem was the FIFO of the SDRAM controller and the Wishbone-Avalon wrapper. - Jun

mkl commented over 17 years ago

Your problem seems to be the same one I am facing. Right now, I am not working on that project anymore, but when I have some time to spare, I will probably investigate the problem further.

In my project, I did not (at first) use SDRAM for reception buffers. My board had 1 MByte of static RAM, and I used that for the Ethernet buffers. I observed some really strange corrupted packets.

I also suspected that the Wishbone-Avalon wrapper could be part of the problem, especially because my corruption pattern changed when I switched from 16-bit SRAM to 32-bit SDRAM.

I discovered the corrupted packets because my application's data was corrupted. I did not use TCP/IP, but some other esoteric protocol with its own checksum mechanism. I had sporadic checksum errors, and it took quite a long time until I suspected the MAC as the origin. In fact, my project was an Ethernet monitor for finding communication problems between two other devices that used this esoteric protocol.

After I changed my RX ISR to do an additional CRC check, I was able to discover many more corrupted packets, including lots of IP packets.

I was able to find one race condition, and I fixed it, but I still had lots of corrupted packets. Because time was running out, I stopped working on that project.

One of the options I had considered before stopping that project was a switch to the MAC from Gaisler Research, which is also published and licensed under the GPL. However, it is written in VHDL (which I do not understand yet), and it uses the AMBA bus. I do not know whether an adapter to Avalon is available or easily implementable.

So, in conclusion, it seems to me that the combination of Altera Nios II and the OpenCores Ethernet MAC still has serious problems. Since the SDRAM controller might be (at least part of) one of them, I would like to hear more details on that issue.

73, Mario

ocghost commented about 17 years ago

I am using the Altera SDRAM controller and the MaCo 5in1 Ethernet. Could you explain a bit more about the fix, or provide it? Thanks.

galland commented over 16 years ago

I have run into this very problem, though what I see is that the first 8 bytes (2 WB writes) are not written. I've tried the code modifications proposed to date in the bug reports in this section, but to no avail. TX works perfectly, so I am trying to see whether the problem comes from my implementation of the WB interface to the RX buffer. I'll post the results here. Regards, Victor

galland commented over 16 years ago

The solution that finally solved it is described by "Jun" in the answers to the bug report called "Race condition in eth_wishbone.v" for this same core.


Assignee: No one
Labels: Bug