As the human population continues to grow, our food production system is increasingly challenged. With tomato as the main fruit produced indoors, selecting varieties that are adapted to specific conditions and deliver higher yields is imperative if we are to meet the growing demand for food. To assist growers and researchers with phenotyping, we present a case study of the Agroscope phenotyping tool (ASPEN) in tomato. We show that the ASPEN pipeline makes it possible to obtain real-time, in situ yield estimates without prior calibration. To discuss our results, we analyse the two main steps of the pipeline, run on a desktop computer: object detection and tracking, and yield prediction. Using YOLOv5, we obtain a mean average precision of 0.85 across all categories. Combined with the best-performing multiple object tracking (MOT) method tested, this gives a correlation of 0.97 with the actual number of tomatoes harvested and, with the help of a SLAM algorithm, a correlation of 0.91 for yield. In addition, the ASPEN pipeline proved able to predict subsequent harvests. Our results demonstrate in situ, real-time estimation of size and quality for each fruit, which could benefit a range of users. To increase the accessibility and use of new technologies, we make publicly available the hardware specifications and software needed to reproduce this pipeline, including the trained weights and a dataset of more than 820 relabelled images for the tomato object detection task.