This is a short two-hour braindump from checking the weaknesses of the paper Levine et al. (2017) "Statistical Detection of Downloaders in Freenet".
It is neither diplomatic nor aggressive, just unfiltered analysis of the publication with the background of knowing Freenet and being interested in its communication for more than 13 years.
Update 2020-12: The Levine group released yet another report, not peer reviewed, in which they claim to correct their errors. I have not read the details yet, because they cite a previous lawsuit as a success, and that lawsuit was built on the results of an earlier publication of even worse quality than the one debunked here. They also claim that their method works against Freenet’s friend-to-friend mode while admitting that this requires a direct connection to a target node, which is exactly what the friend-to-friend mode protects against. Their statement is therefore so seriously misleading that I would characterize it as academic misconduct; I don’t want to waste any more time on it. Again they did not contact Freenet developers prior to publication. This now puts them in the unfavorable situation that integrity would require them to tell at least the first court that their new research proves what Freenet developers warned them about: that the evidence they claimed in court to be robust had so little substance that it essentially was a bunch of lies.
Wrong false positives rate
The core pillar of the detection they name is their claim of a 2.3% false positives rate. But this claim is wrong, because they only reach it through many false assumptions:
They ignore that friend-of-a-friend routing breaks their metric when
- an intermediary node has many connections, or
- the observing node has many connections,
which is not the rare case but the normal case.
They assume that they only get a false positive if a request for a given file reached them with both HTL 18/17 and HTL 16. But the routing algorithm within Freenet most likely causes them to always receive requests from a given node over the same route. Therefore their 2.3% false positives rate contains a mixture of
- the probability of two people requesting the file in the same interval and
- the rate of routing changes within Freenet (for example because a node on the path went offline). If a request from a given peer is received both with HTL 17 and with HTL 16, then routing changed; otherwise this should not happen.
Their false positives rate when measuring with only one node is therefore meaningless. They would need multiple nodes that all see the request.
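To illustrate why a single measured rate cannot separate these two causes, here is a minimal sketch. It assumes, purely for illustration, that the two events are independent and that either one alone produces the observation; the function names and the example probabilities are hypothetical and not taken from the paper.

```python
def observed_rate(p_two_requesters, p_route_change):
    """Probability that at least one of the two events occurs,
    assuming (hypothetically) that they are independent."""
    return 1 - (1 - p_two_requesters) * (1 - p_route_change)


def route_change_rate_needed(observed, p_two_requesters):
    """Route-change probability that would produce `observed` for a given
    probability of two simultaneous requesters (same independence assumption)."""
    return 1 - (1 - observed) / (1 - p_two_requesters)


if __name__ == "__main__":
    observed = 0.023  # the rate reported in the paper
    # Hypothetical values for the share caused by two simultaneous requesters.
    for p_two in (0.0, 0.01, 0.023):
        p_route = route_change_rate_needed(observed, p_two)
        print(f"two requesters: {p_two:.3f}  route change: {p_route:.4f}  "
              f"observed: {observed_rate(p_two, p_route):.3f}")
```

Under these assumptions, very different combinations of the two probabilities yield the same observed 2.3%, so a measurement from one node says nothing about which cause dominates.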
In addition their math is wrong:
“We construct a model by assuming that each request the downloader makes is sent to exactly one of its peers, and that the selection of that peer is made uniformly at random.”
This does not take friend-of-a-friend routing into account. Therefore their math is wrong: it does not match the actual selection of peers, so the results are meaningless for the actual Freenet.
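A rough sketch of how far the uniform assumption can be off: the code below compares the paper's uniform selection with a toy stand-in for friend-of-a-friend routing in which a peer is chosen in proportion to its own number of connections. That weighting is my simplifying assumption for illustration, not Freenet's actual routing algorithm, and the function names and the topology are hypothetical.

```python
from fractions import Fraction


def uniform_share(peer_degrees, observer):
    """The paper's model: every peer is equally likely to receive a request."""
    return Fraction(1, len(peer_degrees))


def degree_weighted_share(peer_degrees, observer):
    """Toy stand-in for friend-of-a-friend routing (an assumption, not
    Freenet's actual algorithm): a peer is chosen with probability
    proportional to its own number of connections."""
    return Fraction(peer_degrees[observer], sum(peer_degrees))


if __name__ == "__main__":
    # Hypothetical downloader with 10 peers: the observing node (index 0)
    # has 100 connections, the other nine peers have 10 connections each.
    peers = [100] + [10] * 9
    print("uniform model: ", float(uniform_share(peers, 0)))
    print("weighted model:", float(degree_weighted_share(peers, 0)))
```

In this toy setup the well-connected observer would see about half of the downloader's requests instead of the one tenth that the uniform model predicts.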
And their model of how measurement works is wrong:
“a simple expected fraction of 1/degree for the adjacent and (1/degree)² for the two-hop case.”
This does not take the degree of the measuring node into account. Therefore it is not a model of routing in Freenet.
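As a hedged illustration of the consequence, the sketch below computes the fractions quoted from the paper for a degree of 10 and compares them with a toy two-hop case in which both the intermediary and the measuring node are well connected. The degree-proportional weighting is an assumption standing in for friend-of-a-friend routing, not Freenet's real algorithm; the topology and names are hypothetical, and the sketch ignores that requests are not routed back to the sender.

```python
from fractions import Fraction


def weighted_share(peer_degrees, target):
    """Toy assumption (not Freenet's actual routing): a peer receives
    requests in proportion to its own number of connections."""
    return Fraction(peer_degrees[target], sum(peer_degrees))


def paper_expectations(degree):
    """Expected fractions as quoted from the paper: (adjacent, two-hop)."""
    return Fraction(1, degree), Fraction(1, degree) ** 2


if __name__ == "__main__":
    adjacent, two_hop = paper_expectations(10)

    # Hypothetical topology: downloader and intermediary each have 10 peers.
    # Index 0 is the next hop towards the observer and has 100 connections;
    # the other nine peers have 10 connections each.
    downloader_peers = [100] + [10] * 9    # index 0: intermediary
    intermediary_peers = [100] + [10] * 9  # index 0: measuring node

    seen_two_hop = (weighted_share(downloader_peers, 0)
                    * weighted_share(intermediary_peers, 0))

    print("paper, adjacent:   ", float(adjacent))
    print("paper, two-hop:    ", float(two_hop))
    print("toy model, two-hop:", float(seen_two_hop))
```

In this toy case a downloader two hops away delivers a larger fraction of its requests to the measuring node than the paper expects from an adjacent one, so it would be misclassified as adjacent.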
Their false positives rate is wrong, their math is wrong, and their model is wrong. Therefore the results you get when using their method are false.
Keep in mind that this is the result of a quick two-hour check of the paper. I might also have gotten things wrong here. Also, to give the Levine-team some credit: they at least tried to actually measure the false positives rate. They did it wrong and drew false conclusions, and that they tried doesn’t make it right and doesn’t excuse persecuting people based on their flawed reasoning, but at least they tried.
That the Levine-team did not contact Freenet developers prior to publication is inexcusable, though. It’s like publishing a paper based on evaluations of particle beams from CERN without ever talking to someone from CERN. Can it be so hard to write an email to press -at- freenetproject.org saying “Hi, we found a method to track Freenet downloaders and drafted a paper based on that. Could you have a look to see whether we missed something?”
If you want to know the actual requirements for calculating a false positives rate in Freenet, head over to https://www.freenetproject.org and read the article Statistical results without false positives check are most likely wrong.