
Nice work man! I also discovered something yesterday that I think is worth pointing out.
DUPLICATE FILES: Within the datasets, there are often emails, doc scans, etc. that are duplicate entries. (I'm not talking about multi-torrent stitching, but actual duplicate documents within the raw dataset.) **These duplicates must be preserved.** When comparing two copies of the same document, I found that sometimes the redactions are in different places! This can be used to extract more info later down the road.
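If anyone wants to try this, here's a rough sketch of how two copies could be compared. It assumes you've already extracted the text of each copy into a string (OCR or whatever you use) and that a redaction shows up as a placeholder or as missing text. The function name `redaction_differences` is just something I made up for illustration, not part of any existing tool.

```python
import difflib


def redaction_differences(text_a: str, text_b: str) -> list[tuple[str, str, str]]:
    """Return (tag, fragment_a, fragment_b) for every span where two copies differ.

    tag is one of 'replace', 'delete', or 'insert' (difflib's opcode names).
    Comparison is word-based, so whitespace and line-wrap differences are ignored.
    """
    words_a = text_a.split()
    words_b = text_b.split()
    # autojunk=False so frequent words are not silently skipped on long documents
    matcher = difflib.SequenceMatcher(a=words_a, b=words_b, autojunk=False)
    diffs = []
    for tag, a0, a1, b0, b1 in matcher.get_opcodes():
        if tag != "equal":
            diffs.append((tag, " ".join(words_a[a0:a1]), " ".join(words_b[b0:b1])))
    return diffs
```

Run it on the extracted text of both copies; every entry it returns is a spot where one copy shows something the other doesn't. OCR noise will produce false hits too, so treat the output as a list of places to check by eye, not as confirmed findings.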
For anyone looking into doing some OSINT work, this file is epic: EFTA00809187
It contains lists of ALL known JE emails, usernames, websites, social media accounts, etc. from that time.