Cake day: February 5th, 2026

  • Hi everyone, maybe I’m a bit late to this, but I wanted to share my findings. I parsed every page up to 40k in DS9 three times, and the results matched PeoplesElbow’s findings by distribution (no content after page 14k and a lot of duplicates), BUT I collected 4 times more unique URLs: 246_079 (still 2x short of the official size). One strange thing is that on the second pass (one day after the first one) I started receiving new URLs on old pages.
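
For anyone who wants to check the "new URLs on old pages" effect themselves, here is a rough sketch of how two passes could be compared. It assumes each pass is stored as a dict mapping page number to the URLs seen on that page; the function name and that layout are my own illustration, not the actual scraper.

```python
def new_urls_per_page(first_pass, second_pass):
    """Return {page: set of URLs} for URLs in second_pass never seen in first_pass.

    Both arguments are assumed to be dicts of page number -> iterable of URLs.
    """
    # Every URL seen anywhere in the first pass, regardless of page.
    seen = set().union(*first_pass.values())

    new = {}
    for page, urls in second_pass.items():
        fresh = set(urls) - seen
        if fresh:
            new[page] = fresh
    return new
```

Pages that show up in the result are old pages that served URLs the first pass never saw.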

    Here are the stats by file type:

     count  | file type 
    --------+-----------
          1 | ts
          8 | mov
        236 | mp4
     244326 | pdf
         73 | m4a
          1 | vob
          1 | docx
          1 | doc
          9 | m4v
       1422 | avi
          1 | wmv
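
A breakdown like the one above can be produced with something along these lines. This is only a minimal sketch, assuming the collected URLs are available as a plain list of strings and that the file type is taken from the extension in the URL path; `count_file_types` is a hypothetical helper name.

```python
from collections import Counter
from pathlib import PurePosixPath
from urllib.parse import urlparse


def count_file_types(urls):
    """Count unique URLs by lowercase file extension taken from the URL path."""
    counts = Counter()
    for url in set(urls):  # deduplicate before counting
        suffix = PurePosixPath(urlparse(url).path).suffix
        # URLs without an extension are grouped under "(none)".
        counts[suffix.lstrip(".").lower() or "(none)"] += 1
    return counts
```

Note that the query string is ignored, so two URLs that differ only in their query string still count as two unique URLs of the same type.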