Right, a 12 GB model trained on 100,000,000 images works out to about 120 bytes per image: room for an MD5 checksum or two, nowhere near enough to store the images themselves.
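As a quick back-of-envelope sketch of that arithmetic (the model and dataset sizes are the figures above; the ~1 MB average JPEG size is my own assumption):

```python
# Back-of-envelope: how much weight budget does the model have per training image?
model_bytes = 12 * 10**9     # 12 GB model, from the figures above
num_images = 100_000_000     # training set size, from the figures above

bytes_per_image = model_bytes / num_images
print(f"{bytes_per_image:.0f} bytes of model per image")  # 120 bytes

# Assumption: a typical training JPEG is on the order of 1 MB.
avg_jpeg_bytes = 1 * 10**6
ratio = avg_jpeg_bytes / bytes_per_image
print(f"each image is ~{ratio:.0f}x larger than its share of the weights")
```

So even under generous assumptions, the weights hold orders of magnitude too little capacity to memorize the training set verbatim.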
The same people expect it to identify the authorship of sentence fragments, yet it has never quoted one whole paragraph from any book. Now, text is different: a few gigabytes of text really could be a significant fraction of all books. But finding even a single recognizable page is news. Storing text is not what these companies spent a bajillion dollars on.