Abstract

In this paper we present Block-o-matic (BoM), our prototype for web page segmentation, and its counterpart for manual segmentation, Manual-design of Blocks (MoB). The main idea is to evaluate the correctness of the segmentation algorithm. Building a ground truth database for evaluation can take days or months depending on the collection size; we address this problem with a manual segmentation tool intended to minimize the time needed to generate the ground truth. Both tools implement the same segmentation rules: the manual version uses them to propose candidate blocks to the assessor, and the automatic version uses them to select blocks. We present a demonstration scenario with a collection of web pages organized in categories. After the pages are annotated, the annotations are compared with the automatic segmentation, yielding a score and a visual comparison.