FMC176 Corrupted Waveforms

Hello,

I currently have two FMC176 boards in my possession: serial numbers 0038 (v1.0, MMCX) and 0054 (v1.1, SSMC). Both are producing corrupted waveform data on D0, but the data on D1 is fine.

I am running both boards on an ML605 with the unmodified FMC176/ML605/Ethernet reference design and its unmodified host application. This design outputs a 300MHz sine wave on both D0 and D1.

From a cold start, 0038 looks correct on both channels for the first 30-60 seconds. See the attached file 20130924_FMC176_0038_1.png (yellow trace=D0, green trace=D1). After that, the D0 waveform is corrupted; see 20130924_FMC176_0038_2.png (yellow trace=D0, green trace=D1).

0054 works somewhat more reliably, but still does output corrupted data. For the first ~30-60 minutes of operation in a recent test, I observed no corrupted data. After that, every minute or two it goes through fits where it outputs corrupted data on D0. An example is shown in 20130924_FMC176_0054.png (yellow trace=D0, green trace=D1).

I am blowing a table fan over my setup, as recommended to me by 4DSP staff. Both cards read temperatures under 50C even after extended operation. What possible causes are there for this? Any suggested solutions?

Another observation I've made is that when the waveforms are correct, the amplitudes of the waveforms output by 0038 are larger than those output by 0054. The software and firmware used by both cards are identical. Did something change in the hardware between v1.0 and v1.1?

Thank you!
Jim


Jim,


It's good to hear that you fixed the issue. Thank you for your feedback, and let me know if you have any more questions.


Thanks,
Kyu
Kyu,

I am using ML605 rev D. The steps to reproduce the error are pretty simple.

1) Generate the ISE project using 4DSP's tools.
2) Build the project in ISE 14.1. Program the ML605 using iMPACT.
3) Run the FMC176 app from MS Visual Studio, communicating with the ML605 via Ethernet.
4) Observe the DAC outputs on a scope. If your FMC176 app is (like mine) unmodified, both channels should be outputting a 300MHz sine wave.
5) Set the scope to trigger if the DAC0 output exceeds the normal envelope of the sine wave.
6) Wait and check the scope periodically for a trigger event. On one of my FMC176 boards I did not observe any failures for 30-60 minutes. On the other board it took only 30-60 seconds to see a failure.
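
The scope trigger condition described above (fire when the DAC0 output exceeds the normal envelope of the sine wave) can also be applied offline to an exported capture. The following is only a minimal Python sketch, assuming the samples are available as a list of floats and that the nominal amplitude is known from a clean capture. The function name `find_envelope_violations` and the 10% margin are illustrative choices, not part of the 4DSP software.

```python
import math

def find_envelope_violations(samples, nominal_amplitude, margin=0.10):
    """Return the indices of samples whose magnitude exceeds the expected envelope.

    nominal_amplitude is assumed to come from a known-good capture;
    margin is an arbitrary allowance for noise (hypothetical default).
    """
    limit = nominal_amplitude * (1.0 + margin)
    return [i for i, v in enumerate(samples) if abs(v) > limit]

# Synthetic record: a clean unit-amplitude sine, plus a copy with one bad sample.
clean = [math.sin(2 * math.pi * k / 16) for k in range(64)]
corrupted = list(clean)
corrupted[20] = 1.8  # injected out-of-envelope sample

print(find_envelope_violations(clean, 1.0))
print(find_envelope_violations(corrupted, 1.0))
```

A check like this only catches corruption that leaves the envelope. Corrupted data that stays within the normal amplitude range would need a different test, such as comparison against the expected sine.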

I can provide my modified design which pins out the MMCM locked signals if that is useful to you.

I should also let you know that yesterday I got both of my FMC176 boards to work on the board I am using for my product. I discovered that both MMCMs were in fact staying locked when using my carrier board and that the real problem was that the DLL on the AD9129 was failing to lock. For some reason, turning off the duty-cycle correction on the DLL allows it to lock for certain phase settings. Once it locks, it stays locked and the data output is clean.

Jim

Jim,


Could you answer the following questions?
1. What is your ML605 revision number?
2. What steps do you use to test the FMC176 firmware on the ML605?


I will try to reproduce the error with the same hardware and fw/sw. If I cannot reproduce it, we may offer an RMA so we can check whether it happens on our ML605.


Thanks,
Kyu
Kyu,

I have finished testing the design with the MMCM locked signals pinned out. This design does meet timing, and it exhibits the same behavior I observed using ChipScope. The data are correct on DAC0 until the first falling edge of the MMCM locked signal. The MMCM sometimes comes unlocked multiple times per second. Other times it will stay locked for minutes on end. If it stays unlocked for long, of course, the output gets turned off because the OSERDES get reset. But a brief unlocked period seems to be enough to cause some havoc. I have yet to see it stay unlocked for more than 10-20 us.
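
If the locked signal is captured as a sampled trace (for example, exported from a logic analyzer), the unlock events and their lengths can be tallied offline. This is a minimal Python sketch under that assumption. The function name `unlock_intervals` and the 0/1 list format are hypothetical, and durations are reported in samples so that the capture's own sample period can be applied afterwards.

```python
def unlock_intervals(locked):
    """Return (start_index, length_in_samples) for each run where locked == 0.

    A run that is still unlocked at the end of the trace is reported with
    the length observed so far.
    """
    intervals = []
    start = None
    for i, level in enumerate(locked):
        if level == 0 and start is None:
            start = i                      # falling edge (or trace starts unlocked)
        elif level != 0 and start is not None:
            intervals.append((start, i - start))
            start = None                   # rising edge closes the interval
    if start is not None:
        intervals.append((start, len(locked) - start))
    return intervals

trace = [1, 1, 0, 0, 0, 1, 1, 0, 1]
intervals = unlock_intervals(trace)
print(intervals)
```

Multiplying each length by the sample period gives the unlock duration, which can be compared against the 10-20 us upper bound reported above.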

I have also observed that the output can be corrupt during long periods with a locked MMCM - but only after it has already been unlocked. My guess is that this state is caused by a brief unlocked period which fails to reset the OSERDES. I can see how this might cause the OSERDES to get out of phase with the data frames. Further reinforcing this theory is the fact that during these periods the corrupted output is not changing; the same corrupted data get output repeatedly until the MMCM loses its lock again.
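
One common way to make sure that even a very brief unlock produces a full reset is to stretch the reset: hold it asserted for a fixed number of clock cycles after lock returns. The following Python behavioral model is only a sketch of that idea. It is not taken from the reference design, and `stretch_reset` and `hold_cycles` are hypothetical names. Whether this would cure the corruption described here is untested.

```python
def stretch_reset(locked, hold_cycles):
    """Model a per-cycle reset derived from a per-cycle locked signal.

    Reset is asserted on every cycle where locked == 0, and for
    hold_cycles further cycles after lock returns.
    """
    reset = []
    remaining = 0
    for level in locked:
        if level == 0:
            remaining = hold_cycles   # reload the hold counter on any unlock
            reset.append(1)
        elif remaining > 0:
            remaining -= 1            # keep reset asserted while counting down
            reset.append(1)
        else:
            reset.append(0)
    return reset

locked = [1, 1, 0, 1, 1, 1, 1, 1]     # a single one-cycle unlock glitch
reset = stretch_reset(locked, hold_cycles=3)
print(reset)
```

In this model a one-cycle glitch on the locked signal yields a reset four cycles long, so the downstream logic always sees a reset of a guaranteed minimum width.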

I also did an experiment where I provided the FMC176 with an external 2.458GHz sample clock. This did not appear to change the behavior of the DAC0 MMCM. It came unlocked just as regularly as when using the internal clock.

Jim
The MMCM for DAC0 is located at MMCM_ADV_X0Y0. DAC1's MMCM is located at MMCM_ADV_X0Y3. I commented out the chipscope instances and rebuilt. All timing constraints are met. Unfortunately I still observe corrupted data on DAC0. Is it possible the design is not properly constrained?

I think I will pin out the MMCM locked signals so I can monitor them without using chipscope. It will be useful to see if the MMCM is still coming unlocked even in a design that meets timing. I'll let you know what I find.

Jim,


I just recompiled it with 14.6 (I don't currently have 14.1) and I don't see any timing failures. The failures may be caused by the ChipScope signals. I expected the clock might be an issue, but to me it's kind of weird that DAC0 loses lock while DAC1 does not, because the DAC0 clock signal is in the inner column, which allows the fast path. Where is your DAC0 MMCM located? The closest location to DAC_DCO_0 is MMCM_ADV_X0Y0. If it is instantiated far from the DAC_DCO_0 pair, you may want to locate it manually.


Kyu
Kyu,

Thanks for the suggestion. I have modified the ML605 reference design; I added a ChipScope ILA to each of the ad9129_phy_v6 instances, monitoring the dcm_locked net. As I think we both expected, the dcm_locked net corresponding to DAC0 is frequently deasserted. The dcm_locked net corresponding to DAC1 stays high.

I have also done an experiment where I power-on and quickly start ChipScope with the trigger on dcm_locked's falling edge. The first time it triggers corresponds to the first appearance of corrupted data on D0.

This makes complete sense given the behavior I've been observing. Why is the DCM coming unlocked? I did notice two timing failures when I built the ChipScope-instrumented design. The two constraints which failed are TS_DAC0_DCO_N_0 and TS_DAC1_DCO_P_0. Seems related. Unfortunately my ISE is only licensed up to v14.1 for Virtex-6, so I used 14.1 to build the design. I know 14.4 is what I should be using. Does the design meet timing using 14.4, or do these same timing failures exist in that environment too?

Thank you once again for your help,
Jim


Jim,


We do not have a reference design with ChipScope. However, you have the source code and software, so you can recompile it with ChipScope. My best guess is that the clock is the cause. You may want to use ChipScope on the MMCM lock signals to see whether they lose lock when the data is corrupted. I know that the ML605 design has a clock_dedicated_route constraint for DAC1. This may cause an issue on DAC0. Please verify it. Also, the design units with "ad9129" in their names are the DAC source code. You should check whether the data from the memory to the OSERDES is also corrupted.


Source code is at C:\Program Files (x86)\4dsp\Common\Firmware\Extracted
Software is at C:\Program Files (x86)\4dsp\4FM Core Development Kit\Plug-Ins


Thanks,
Kyu
Kyu,

The connector looked clean but I tried cleaning it anyways. No change in the behavior. Unfortunately I do not have a second ML605 to try this on.

It is very important that I understand the cause of this issue. I can't simply use the VC707 instead of the ML605. We have designed the FMC176 into our product. When running the FMC176 on our own board, behavior is very similar to the behavior we see when running on the ML605.

Since the boards both work fine on the VC707, this does not appear to be a problem with the FMC176 cards. But clearly something is going awry when running on the ML605. What should my next steps be in troubleshooting the issue? Do you have a version of the reference design instrumented up with ChipScope that I could use to collect some debug data?

Thank you!
Jim

Jim,


Basically they have the same design architecture. I don't see any big difference except the locations of the logic instantiations, which differ because the two boards use different FPGA devices. I first thought it might be because of the Clock_dedicated_route constraint in the ML605 design, but that applies to DAC1. You have an issue with DAC0, and our test does not produce the same issue. Possibly your ML605 is causing this issue. You may want to try cleaning the connector on the ML605, or try a different ML605 if possible.


Thanks,
Kyu
Kyu,

Yes, in the latest version of the BSP, there is an extracted VC707 reference design. I have tested it and it appears to always output clean data. After roughly an hour of monitoring I have yet to observe any corrupted data. Just to be sure, I switched back to the ML605 for a bit and quickly observed corrupted data.

What could be different about the ML605 and VC707 reference designs that might cause this problem?
Thanks again for your attention!

Jim


Jim,


We have a VC707 reference design. If you purchased our FMC176, the other reference designs should also have been extracted. The files should be in "C:\Program Files (x86)\4dsp\Common\Firmware\Recovery\348_vc707_fmc176"


Thanks,
Kyu
I have a VC707. I also have a couple of Avnet boards: the Zynq 7045 Minimodule Plus and the Spartan-6 LX 16 board. Both of the Avnet boards only have an LPC connector, so we would only be able to check out DAC0 if we used one of them.

Jim,


Do you have other carrier boards to try? If yes, I can provide the reference design.


Thanks
Kyu
So, what should my next troubleshooting steps be?