In recent years, it is highly demanding to deploy Deep Neural Networks (DNNs) on edge devices, such as mobile phones, drones, robotics, and wearable devices, to process visual data collected by the cameras embedded in these systems. In addition to the model inference, training DNNs locally can benefit model customization and data privacy protection. Since many edge systems are powered by batteries or have limited energy budgets, Field-Programmable Gate Array (FPGA) is commonly used as the primary processing engine to satisfy both demands in performance and energy-efficiency. Although many recent research papers have been published on the topic of DNN inference with FPGAs, training a DNN with FPGAs has not been well exploited by the community. This paper summarizes the current status of adopting FPGA for DNN computation and identifies the main challenges in deploying DNN training on FPGAs. Moreover, a performance metric and evaluation workflow are proposed to compare the FPGA-based systems for DNN training in terms of (1) usage of on-chip resources, (2) training efficiency, (3) energy efficiency, and (4) model performance for specific computer vision tasks.