Webpage Segmentation repository
Segmentation details
GOSH:blog:20140924151401:5:SEG-2255:GT:chrome
URL
  Original: http://www.tumblr.com/docs/pages
  In cache: http://www.tumblr.com/docs/pages
Dataset code: 20140924151401
Algorithm: GT
Browser: chrome
Geometry: 880x5001
Category: Google Search - blog
Granularity: 5
Word count: 1854
Taken: 2014-03-17 22:39:16
From: 132.227.204.64
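The record identifier at the top of this page appears to pack several of the fields listed here (category, dataset code, granularity, algorithm, browser) into one colon-separated string. The following is a minimal sketch of splitting it apart, assuming exactly seven fields in that order. The field names `collection` and `segmentation`, the class `SegmentationId`, and the function `parse_segmentation_id` are hypothetical labels chosen for illustration, not names used by the repository.

```python
from typing import NamedTuple


class SegmentationId(NamedTuple):
    """Fields of a record identifier (names are assumptions, see above)."""
    collection: str    # e.g. "GOSH"; meaning assumed, not documented here
    category: str      # e.g. "blog"
    dataset_code: str  # e.g. "20140924151401"
    granularity: int   # e.g. 5
    segmentation: str  # e.g. "SEG-2255"
    algorithm: str     # e.g. "GT"
    browser: str       # e.g. "chrome"


def parse_segmentation_id(raw: str) -> SegmentationId:
    """Split a colon-separated identifier into its seven assumed fields."""
    parts = raw.split(":")
    if len(parts) != 7:
        raise ValueError(f"expected 7 colon-separated fields, got {len(parts)}")
    collection, category, dataset_code, granularity, segmentation, algorithm, browser = parts
    return SegmentationId(
        collection, category, dataset_code, int(granularity),
        segmentation, algorithm, browser,
    )


record = parse_segmentation_id("GOSH:blog:20140924151401:5:SEG-2255:GT:chrome")
print(record.dataset_code, record.granularity, record.algorithm, record.browser)
```

The parsed values can be checked against the Dataset code, Granularity, Algorithm, and Browser fields of the record.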
Screenshot
BId | Block geometry (x y w h) | Gran. Label Elem. Words Imp. Text Density Images
G2 | 15.00 0.00 880.00 76.00 | 0 5 9 1 0.00
G4 | 40.00 133.00 330.00 174.00 | 0 1 3 1 0.00
G5 | 40.00 203.00 855.00 4,790.00 | 0 319 1438 25 0.00
G3 | 0.00 4,840.00 820.00 4,947.00 | 0 28 13 0 0.00
G1 | 0.00 4,982.00 820.00 5,001.00 | 0 70 8 0 0.00