MobileNetV1 - "size of variable 'srl' is too large to handle" from Q_srl.v at step "step_out_of_context_synthesis" #1260

Open · kalahel opened this issue Jan 10, 2025 · 4 comments
Labels: bug (Something isn't working)

@kalahel

kalahel commented Jan 10, 2025

Prerequisites

Please make sure to check off these prerequisites before submitting a bug report.

  • The bug appears on the current main-branch.
  • Check that the issue hasn't already been reported, by checking the currently open issues.
  • If there are steps to reproduce the problem, make sure to write them down below.
  • If relevant, please include the ONNX files, which were created directly before and/or after the bug.

Quick summary

I'm currently working on splitting the MobileNetV1 from the finn-examples repository. I split off its first part so that it fits on a Zynq 7020 (ONNX below). The problem is that I get "size of variable 'srl' is too large to handle" at step "step_out_of_context_synthesis".

Configuration

  • Vivado 2022.2 (Linux)
  • FINN running on Ubuntu under WSL
  • Targeting a Zynq 7020

ONNX model

https://file.io/DivU9fhs8KuU

Details and steps

To get to the model above, I ran the steps from the finn-examples scripts (the same way as the README):

    "step_mobilenet_streamline",
    "step_mobilenet_lower_convs",
    "step_mobilenet_convert_to_hw_layers_separate_th",
    "step_create_dataflow_partition",
    "step_specialize_layers",
    "step_apply_folding_config",
    "step_minimize_bit_width",

I then split the model to fit on a Zynq 7020 (not a Pynq, but similar).

Then I ran it through a script containing only these build steps (a minimal builder-configuration sketch follows the list):

    "step_generate_estimate_reports",
    "step_hw_codegen",
    "step_hw_ipgen",
    "step_set_fifo_depths",
    "step_create_stitched_ip",
    "step_measure_rtlsim_performance",
    "step_out_of_context_synthesis",
    "step_synthesize_bitfile",

Everything runs fine until step_out_of_context_synthesis, where the script seems to stop. I investigated the generated Vivado project for this step and got this report:

vivado.log

In short, the error is:
size of variable 'srl' is too large to handle; the size of the variable is 1204216, the limit is 1000000 [/synth_out_of_context_6dkwtifv/results_finn_design_wrapper/Q_srl.v:100]

I think it is caused by the input size of the model: 224 x 224 x 3 x 8 bits = 1,204,224 bits.
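
As a quick sanity check of that arithmetic (plain Python; the assumption that the auto-sized FIFO buffers roughly one full input frame is mine):

    # 224 x 224 pixels, 3 channels, 8 bits each, assumed to sit in a single FIFO
    frame_bits = 224 * 224 * 3 * 8
    print(frame_bits)              # 1204224
    print(frame_bits > 1_000_000)  # True -> exceeds the reported limit of 1000000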

I tried fixing it manually with the Tcl command set_param synth.elaboration.rodinMoreOptions "rt::set_parameter var_size_limit 4194304" (source), which seemingly fixed it, but now I get the error:
ERROR: [Synth 8-403] loop limit (65536) exceeded [/synth_out_of_context_moj0vvpc/results_finn_design_wrapper/Q_srl.v:179].

How can I fix this behavior? I assume that, since the example works on the Alveo, Vivado should be able to synthesize loops and arrays this big.

Thank you for your time,
Mat

@kalahel added the bug (Something isn't working) label Jan 10, 2025
@fpjentzsch (Collaborator)

Hi,
it seems like you have a very deep FIFO in your accelerator. Maybe you can optimize your folding or FIFO sizing to avoid this in the first place; it should not be necessary to buffer the whole input frame for a MobileNetV1.

Otherwise you could try the builder option split_large_fifos.
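
For what it's worth, split_large_fifos is a field of the builder configuration; a rough sketch of where it would go (the other values below are placeholders, not taken from this issue):

    import finn.builder.build_dataflow_config as build_cfg

    # Sketch only: keep automatic FIFO sizing, but let the builder split very
    # deep FIFOs into several smaller ones, so no single Q_srl instance exceeds
    # Vivado's variable size limit.
    cfg = build_cfg.DataflowBuildConfig(
        output_dir="output_mobilenet_split",  # placeholder
        synth_clk_period_ns=5.0,              # placeholder
        auto_fifo_depths=True,
        split_large_fifos=True,
        generate_outputs=[build_cfg.DataflowOutputType.STITCHED_IP],
    )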

@sn0wst0rm

@fpjentzsch
Hello,
I had a related problem during step_set_fifo_depths with automatic FIFO sizing searching for the best fit. I have a StreamingMaxPool that processes an input image of 416x416x8, and the estimated cycles are ~215,000. When performing the simulation to analyze the FIFOs, the build failed because the simulation period was set lower than the time the layer actually takes to simulate, which was reported to be at least double that (over 500,000 cycles).
To proceed I had to manually patch the minimum period, and I think this is a known issue, because in your code for the StreamingMaxPool custom op's clock-cycle estimation method I found a TODO saying the calculation was wrong.

Now, I get that it could be a bit off, but in this case it was halving the performance.

I noticed that the StreamingMaxPool_Precision implemented in HLS first processes the whole image, pixel by pixel, and then outputs it, without supporting any SIMD (since all the channels are already processed in parallel) or PE.

I found an old PR, #789, made by you that talks about also parallelizing over pixels (MMV). Did you ever accomplish this in any way? The code in that PR is pretty old and I don't even think it can still be merged, but it would be nice to have such an option, since for the first layers (typically bigger, as for my YOLO) we are basically limited by this slow computation.

If you have any tips or advice on how to improve the performance of this layer, they would be very welcome.

Thank you!

@kalahel (Author)

kalahel commented Jan 13, 2025

Hi,

Thanks for your responses.
@fpjentzsch setting auto_fifo_depths=False did solve the issue, thank you a lot!
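
For anyone hitting the same error, the change was just this builder option; a minimal sketch (the other values are placeholders, and as I understand it the FIFO depths from the folding config are then used instead of the automatically derived ones):

    import finn.builder.build_dataflow_config as build_cfg

    # Sketch only: disable automatic FIFO sizing for the build.
    cfg = build_cfg.DataflowBuildConfig(
        output_dir="output_mobilenet_split",  # placeholder
        synth_clk_period_ns=5.0,              # placeholder
        auto_fifo_depths=False,
        generate_outputs=[build_cfg.DataflowOutputType.BITFILE],
    )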

I can close the thread now, or leave it open if you want to answer @sn0wst0rm.

Best regards,
Mat

@sn0wst0rm

@kalahel if you don't mind, I'd wait a bit more for an answer from @fpjentzsch.

Thanks!
