— How do you differ from countless news aggregators on the market?
First, we don’t see ourselves as an aggregator. We collect news from vetted sources, and our proprietary algorithm returns a score used to rank the stories. The goal is to deliver the best journalism published the previous week on a specific subject. In the case of the free Deepnews Digest, launched in June, the topic is based on what’s in the news, and we can decide on the spot to produce a special edition, as we did with the coronavirus outbreak. The same concept applies to the paid-for verticals we launched two weeks ago — the Distills — for which we spotlight the 25 most valuable stories on a given subject. Second, we don’t see the point of showing the reader three or four stories featuring the same elements of information, packaged in slightly different ways. Our goal is to provide a product that saves time, not one that distracts the reader with a clutter of redundant stories.
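The ranking step described above — a model scores each story, and only the top stories make the issue — can be sketched roughly as follows. This is a toy illustration, not Deepnews’ actual code: the `Story` fields, scores, and the cutoff of 25 are stand-ins.

```python
# Hypothetical sketch: each vetted story carries a quality score from the
# (proprietary) model, and the n highest-scoring stories make the issue.
from dataclasses import dataclass

@dataclass
class Story:
    title: str
    source: str
    score: float  # quality score returned by the scoring model

def top_stories(stories, n=25):
    """Rank stories by model score, descending, and keep the n best."""
    return sorted(stories, key=lambda s: s.score, reverse=True)[:n]

stories = [
    Story("In-depth outbreak analysis", "outlet-a", 0.91),
    Story("Quick rehash of the wires", "outlet-b", 0.34),
    Story("Original local reporting", "outlet-c", 0.78),
]
picks = top_stories(stories, n=2)
```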
— Same questions vis-à-vis specialized newsletters, what’s different about you?
The biggest differences are scalability and automation of the editing process. A pure editorial newsletter — i.e., reporters assigned to a sector and producing original content — scales only as its newsroom grows, at a significant cost. Deepnews’ system is different. We pick a subject, find the sources, vet them, set up the crawling system and the CMS, and launch. As for production, our newsletter system is designed to require no more than an hour of editing time per issue: checking for algorithm mistakes, false positives, and duplicates, and writing the editor’s pick boxes. We let the algorithm do most of the work, and frankly, even the best newsletter writers cannot read the sheer number of articles, in all the different places, that the algorithm can. A human writing a preview newsletter on the Oscars is probably not going to read the Northwest Arkansas Democrat-Gazette to find original in-depth reporting, though Deepnews can and did.
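One part of that hour of editing — spotting duplicate stories among the top-ranked results — could in principle be assisted by a simple heuristic like the one below. The token-overlap measure and the 0.6 threshold are illustrative assumptions, not a description of Deepnews’ pipeline.

```python
# Toy near-duplicate flagger: headlines whose word sets overlap heavily are
# surfaced as candidate duplicates for the editor to review.
def tokens(text):
    return set(text.lower().split())

def jaccard(a, b):
    """Jaccard similarity of the word sets of two strings."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb)

def flag_duplicates(headlines, threshold=0.6):
    """Return index pairs of headlines that look like near-duplicates."""
    pairs = []
    for i in range(len(headlines)):
        for j in range(i + 1, len(headlines)):
            if jaccard(headlines[i], headlines[j]) >= threshold:
                pairs.append((i, j))
    return pairs

flags = flag_duplicates([
    "Tech giant unveils new phone at annual event",
    "Tech giant unveils new phone at its annual event",
    "Local school wins robotics championship",
])
```

A real system would compare article bodies, not just headlines, but the principle — surface likely redundancy for a quick human decision — is the same.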
— Why do you constrain your system in a closed environment of sources?
Two reasons. First, we want the Deepnews Digests and Distills to be shielded from fake or misleading information, which means vetting the sources. Depending on one’s perspective, some of them might be deemed biased one way or another, but they are all reliable sources. Second, we want to dive into the long tail of good journalism and expertise. It would be easy to spotlight the most talked-about stories by relying on social propagation, but that is exactly what we don’t want to do. We want to surface the author who is an expert on something but doesn’t write often and is not good at self-promotion. That means doing proper research on the relevant sources, sometimes with the help of an expert in the field.
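Mechanically, a closed environment of sources amounts to an allowlist that the crawler checks before fetching anything. A minimal sketch, with invented domain names:

```python
# Illustrative allowlist gate: the crawler only fetches articles hosted on
# domains an editor has explicitly vetted. Domain names are made up.
from urllib.parse import urlparse

VETTED_SOURCES = {"example-journal.com", "regional-gazette.com"}

def crawlable(url):
    """True only for URLs hosted on a vetted domain."""
    return urlparse(url).netloc in VETTED_SOURCES

ok = crawlable("https://example-journal.com/2020/05/long-read")
blocked = crawlable("https://random-blog.net/hot-take")
```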
— What is the biggest challenge you had to face in developing the product?
The most complicated part is translating our editorial vision into a fine-tuned deep learning model: for instance, finding the right “weight” between sources and avoiding all sorts of biases based on notoriety, the length of the piece, the catchiness of the headline, and so on. We have already built about 100 versions of our Scoring Model, and we are constantly testing new approaches. Further down the road, we might have to develop several models working in parallel, such as one focusing on the angle of an article (the most critical editorial decision) and another dedicated to assessing the sources, and perhaps the authors, in addition to the pure language analysis that is at the core of the current system (more about the technology here).
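The “models working in parallel” idea can be pictured as a weighted blend of per-model scores. The model names, weights, and linear combination below are invented for illustration; the actual Scoring Model is proprietary and far more elaborate.

```python
# Hedged sketch of an ensemble: separate scores for language quality,
# source reliability, and the article's angle, blended with fixed weights.
# All weights and score values here are illustrative assumptions.
def blended_score(language, source, angle, weights=(0.6, 0.25, 0.15)):
    """Weighted combination of per-model scores, each in [0, 1]."""
    w_lang, w_src, w_angle = weights
    return w_lang * language + w_src * source + w_angle * angle

s = blended_score(language=0.8, source=0.9, angle=0.5)
```

A linear blend is only one option; the weights themselves could be learned, which is where the difficulty of “finding the right weight between sources” comes in.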