

This depends heavily on the type of FPGA you are using. Usually, an FPGA has a logic cell structure, where each cell is essentially a lookup table (LUT) with multiple inputs. For example, an Altera Cyclone II has 4-input LUT elements, so I will try to give an estimate based on that.

Using a 4-input LUT, you can compare 2 bits at a time.

For 3 bytes (24 bits), you need 12 LUTs in the first stage. Each LUT returns 1 for equality and 0 otherwise. These 12 outputs have to be reduced to a single signal. 4 inputs can be combined by an AND into a single output, so for 12 inputs this needs 3 LUTs in the first reduction stage and 3/4 of a LUT (3 of its 4 inputs) in the second, resulting in 15.75 LUTs per byte sequence. Multiply this by 4096 and you get 64512 LUTs with 3072 (4096 * 0.75) outputs. To get a single output, you have to combine these outputs into a single signal, which takes roughly another 1024 LUTs. Combined with the 64512 LUTs for equality, you get 65536 LUTs.

This is, of course, a naive estimate which does no optimisation and does not consider the internal structure of an FPGA, which could change the result (fanout requirements, signal delay, etc.), but you can use it to check your synthesis result.

In the world of Xilinx FPGAs, I would probably be able to do that comfortably in an Artix-7 100T. I could probably do it in a smaller part as well. Do keep in mind that Xilinx (and probably Altera and Lattice as well) generally make you buy the development software for FPGAs this size. This board would probably work fine for you:
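The arithmetic behind the 4-input LUT estimate in this answer can be reproduced with a short script. This is only a sketch of the naive counting: the function names `reduction_luts` and `estimate` are made up for illustration, partly used LUTs are counted as fractions (as with the 3/4 LUT), and the 3072 output signals are taken over from the estimate as stated rather than derived.

```python
import math
from fractions import Fraction

LUT_INPUTS = 4  # assumption: 4-input LUTs, as on a Cyclone II


def reduction_luts(signals, lut_inputs=LUT_INPUTS):
    """Fractional LUT count for reducing `signals` to a single signal.

    Each stage needs signals / lut_inputs LUTs; a partly used LUT is
    counted as a fraction, matching the 3/4 LUT in the estimate.
    """
    total = Fraction(0)
    n = Fraction(signals)
    while n > 1:
        n = n / lut_inputs
        total += n
    return total


def estimate(sequences=4096, seq_bytes=3, lut_inputs=LUT_INPUTS):
    """Return (per_sequence, equality, combine, total) LUT counts."""
    bits = seq_bytes * 8
    # A 4-input LUT compares 2 bits of data against 2 bits of pattern.
    compare = Fraction(bits, lut_inputs // 2)
    # AND the per-LUT equality outputs down to one match signal.
    per_sequence = compare + reduction_luts(compare, lut_inputs)
    equality = per_sequence * sequences
    # Output count as given in the estimate: 4096 * 0.75 = 3072.
    outputs = sequences * Fraction(3, 4)
    combine = reduction_luts(outputs, lut_inputs)
    return per_sequence, equality, combine, equality + combine


if __name__ == "__main__":
    per_sequence, equality, combine, total = estimate()
    print("LUTs per sequence:", float(per_sequence))  # 15.75
    print("LUTs for equality:", int(equality))        # 64512
    print("LUTs to combine:  ", float(combine))       # 1023.75
    print("Total (rounded up):", math.ceil(total))    # 65536
```

Changing `lut_inputs` (for example to 6 for a 6-input LUT architecture) gives a quick feel for how strongly the result depends on the device family, though a real synthesis tool will pack and optimise the logic differently.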
