Anonymous peer-to-peer downloads and a distributed database inside browsers: Torrent-like but untrackable

Ldfa · 30 December 2019

Introduction

Download files from the internet, or peer to peer, anonymously from your browser. You don't need to install anything, since the entry point to the anonymizer network is a JavaScript application running inside your browser. Files are stored and shared inside the browsers too (imagine how powerful this is, with 1.5 billion browsers currently in the world). A presentation and a live demo are also available.

The problem is an old one, and it is still valid today: you cannot easily exchange information without going through a third party that might do something with it without your permission.

Other projects of this type use the newer WebRTC (Web Real-Time Communication) technology. The problem with WebRTC is that the first thing you do is advertise to the world how to reach you; see the case study below.

That's why we made Peersm. It lets you exchange information anonymously between peers, or fetch it from the internet, directly from your browser. The information is then distributed inside the browsers' storage and can be shared between peers, a bit like Torrents, except that nobody knows what you have or what you are doing.

Where does the Peersm name come from and what's the status of the project?

It's difficult to find a free domain name that includes 'peer' or 'peers'. We then thought of 'Peersm' because it was available, and because if you move the 'r' you get its opposite: 'Prism', the NSA program.

In addition, the geometrical figure below is an antiprism polyhedron. It symbolizes the Peersm network inside the Tor network, defeating Prism and Prism-like programs.

The first release is now live. The anonymizer network used is the well-known Tor network; see more details below.

ORDB1 and ORDB2 are live. These are our own Tor nodes, implementing both the ORDB protocol and the normal Tor protocol.

For the first release, two Tor bridges are used to access the Tor network via WebSockets: tor1.bamsoftware.com:9901 and 213.246.63.20:8002.

Transparency

The application is written in JavaScript, so the code is visible inside your browser or in the node-Tor repository. It is therefore quite easy to check what the application is doing. Moreover, if debug mode is activated, you see all the details of where the application connects and which nodes are used to substitute for your real IP address.

The "ORDB" below is one of our routers that has been inserted into the anonymizer network. It relays data between peers (Alice and Bob here) without knowing whom it is talking to or what it is relaying. Peers inform it of what they have using references called 'hash_names'.

Alice and Bob connect to the ORDB through the anonymizer network directly from their browsers, using WebSockets to reach the first node of the anonymizer network. All links are encrypted.

The browsers speak the Tor protocol, so there is no interface of ours in the middle to access the Tor network: the browsers access the Tor network like a usual Tor node.

Bob has some files in his browser's storage, and Alice is downloading one of them.

Alice wishes to download the lion video from http://www.lion.com. The system detects that nobody has this file, so it starts a direct anonymous download from the site www.lion.com.

Bob wishes to download the lion video too, using http://www.lion.com. The system detects that someone has it (Alice, but it does not know that it's Alice) and downloads the file from Alice's browser. The next peer will be able to get the file from Alice or Bob.
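
The decision made in the two steps above can be sketched as follows. This is a minimal illustration and not Peersm's actual code: the `Ordb` class, its `advertise` and `choose_source` methods, the circuit identifiers and the placeholder hash_name are all assumptions made for the example.

```python
class Ordb:
    """Toy stand-in for an ORDB: maps hash_names to anonymous circuit ids."""

    def __init__(self):
        self._holders = {}  # hash_name -> list of circuit ids

    def advertise(self, hash_name, circuit_id):
        # Only a circuit id is recorded, never an identity or an IP address.
        self._holders.setdefault(hash_name, []).append(circuit_id)

    def choose_source(self, hash_name):
        # Prefer a peer that already holds the file, else download directly.
        holders = self._holders.get(hash_name, [])
        return ("peer", holders[0]) if holders else ("direct", None)


ordb = Ordb()
lion = "hash-of-http://www.lion.com"      # placeholder for the real hash_name

alice_source = ordb.choose_source(lion)   # nobody has the file yet
ordb.advertise(lion, "circuit-A")         # Alice now holds the complete file
bob_source = ordb.choose_source(lion)     # served from Alice's browser
print(alice_source, bob_source)
```

The point of the sketch is that the lookup only ever handles hash_names and circuit ids, which matches the idea that the relay does not know who Alice or Bob are.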

Bob now uploads personal photos into his browser's storage. He decides to encrypt them (this is optional, not mandatory) and gets the encryption key from the application.

Bob gives Alice the references she must use to download the photos: a hash_name 'abcdefg', the decryption key '1234' and the extension of the file '.zip'.

Alice enters the hash_name 'abcdefg' and the system downloads the file from Bob's browser. Alice decrypts it with the key she knows and renames it with the right extension. She can now open, save or share the photos with others.

Subscription and security

To access the service, you must use a URL like: http://peersm.com/your_reference.

There will be a public, free-of-charge, limited version and a private one for which you will have to pay a small "participation in expenses" fee (why? Because we need to scale the Peersm ORDB servers and manage the references/keys). You can choose to use it for one day, one week, one month, etc. For the private version, we only know your email address, which is used to send you your reference and your key. These are like your password or PIN code: you must never send your reference or your key to anybody.

The key is known to you and to the Peersm site. It is used to secure what you are loading and to make sure the application is Peersm's own and not a modified one.

This is really very difficult to break, but not impossible, so an additional mechanism will be added to let you check the hash of the application with third parties (see Subresource Integrity if you want to know more about the state of the art).
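
To make the idea concrete, here is a minimal sketch of how a Subresource Integrity value is computed for a script: the SHA-384 digest of the file's bytes, base64-encoded and prefixed with the algorithm name. The `app_code` bytes are a placeholder and not Peersm's application, and this is not necessarily the mechanism Peersm will adopt.

```python
import base64
import hashlib


def sri_value(script_bytes: bytes) -> str:
    """Return an SRI integrity string (sha384) for the given script bytes."""
    digest = hashlib.sha384(script_bytes).digest()
    return "sha384-" + base64.b64encode(digest).decode("ascii")


app_code = b"console.log('placeholder application');"  # hypothetical content
published = sri_value(app_code)

# A third party (or the user) recomputes the value and compares it.
same = sri_value(app_code) == published
tampered = sri_value(app_code + b"// modified") == published
print(published, same, tampered)
```

Browsers apply the same check themselves when a script element carries an `integrity` attribute with such a value.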

Peersm will ask whether you wish to store the key so you don't have to enter it each time. If you accept, you must make sure that nobody else can access your device and retrieve the key.

The key also lets you choose when you start the application, so that we cannot correlate who has loaded the code with who is establishing circuits with the ORDBs.

For people skilled in JavaScript, it's not very difficult to bypass the above mechanism. If you do so, you do it at your own risk, since the mechanism secures not only the code loading but also the validity of the anonymizer routers.

Names and references

Hash_names are used to identify a download.

If the file was originally downloaded from the web, the hash_name is the hash of the link (public). If the file was uploaded inside the browser, the hash_name is a hash provided by the system (private). So the same file can have different hash_names, which splits the traffic better than having a unique reference (such as its hash) for all copies of the same file, and which makes it more difficult for potential observers to know what files you have. The hash of a file, on the other hand, makes it possible to detect whether the file was modified for malicious reasons.
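
The difference between the two kinds of hash_name, and between a hash_name and the hash of the file itself, can be sketched like this. The choice of SHA-256, the 20-byte random value and the function names are assumptions for the example; the source does not say which algorithm Peersm uses.

```python
import hashlib
import secrets


def public_hash_name(link: str) -> str:
    """Public reference: the hash of the link (SHA-256 is an assumption)."""
    return hashlib.sha256(link.encode("utf-8")).hexdigest()


def private_hash_name() -> str:
    """Private reference: provided by the system, modelled here as random bytes."""
    return secrets.token_hex(20)


file_bytes = b"same content"
content_hash = hashlib.sha256(file_bytes).hexdigest()  # integrity check only

a = public_hash_name("http://aaa.com/myfile.ext")
b = public_hash_name("http://aaa.com/myfile.ext")
p1, p2 = private_hash_name(), private_hash_name()

# Same link, same public hash_name; two uploads of the same file get
# unrelated private hash_names, and neither equals the content hash.
print(a == b, p1 != p2, a != content_hash)
```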

A name is associated with the file. For http://aaa.com/myfile.ext (public), the name is myfile.ext. For private references, the name is either the name of the file that was uploaded inside the browser or a random string (in that case you are advised to rename it with a more intuitive name and a correct file extension). For private references, you must provide receivers with the hash_name and the file extension, since they cannot know the file extension after they have downloaded it; they only know the type of the file, but this might not be enough to use it correctly (open, save, etc.).

An encryption key comes along with the hash_name: data are encrypted using this key, so the ORDB does not see the data in the clear.

Since hash_names are long and not user-friendly, you should of course copy/paste them rather than type them manually.

If you feel this is not enough, you can additionally encrypt files that you have downloaded or uploaded. In that case you get a private encryption key that nobody else knows, the name of the encrypted file becomes name.ext.enc, and the system keeps a reference to the initial type of the file and stores the file in indexedDB as a binary file. The sender must provide receivers with the hash_name, the encryption key and the extension of the file. While downloading an encrypted file, the receiver receives the initial type of the file, so that he can decrypt it and store it in the right format (but nothing else, since the ORDB must not know any other information about the files it is relaying). He can then open or save it using the extension provided by the sender.
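
The bookkeeping described above can be sketched as follows. This only illustrates the naming and what the sender hands over; the `PrivateReference` class and the helper names are hypothetical, and no actual encryption is performed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PrivateReference:
    """What the sender gives to receivers out of band (hypothetical structure)."""
    hash_name: str
    key: str
    extension: str


def encrypted_name(name: str) -> str:
    # name.ext becomes name.ext.enc once the file is encrypted
    return name + ".enc"


def restored_name(stored_name: str, ref: PrivateReference) -> str:
    # Strip the .enc suffix and any old extension, then apply the extension
    # that the sender provided.
    base = stored_name[:-len(".enc")] if stored_name.endswith(".enc") else stored_name
    stem = base.rsplit(".", 1)[0] if "." in base else base
    return stem + ref.extension


ref = PrivateReference(hash_name="abcdefg", key="1234", extension=".zip")
stored = encrypted_name("photos.zip")
print(stored, restored_name(stored, ref), restored_name("x7f3q.enc", ref))
```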

The type of a file is used for consistency reasons when storing a file in indexedDB, but senders can change the type of the file if they don't want the ORDB to be aware of the real one.

So, basically, people must exchange this information by the usual means (blogs, emails, SMS, internet search, etc.) in order to download private and/or encrypted files.

Malicious people could associate infected files with a valid hash_name. Peers must therefore advertise the hash of the right file, so that peers downloading this file can check that it is the correct one.
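
A receiver-side check of this kind could look like the sketch below. SHA-256 and the function name `file_matches` are assumptions; the point is only that the downloaded bytes are hashed locally and compared with the hash advertised out of band.

```python
import hashlib
import hmac


def file_matches(data: bytes, advertised_hex: str) -> bool:
    """Compare the local hash of the downloaded bytes with the advertised one."""
    actual = hashlib.sha256(data).hexdigest()
    return hmac.compare_digest(actual, advertised_hex.lower())


original = b"lion video bytes"                      # placeholder content
advertised = hashlib.sha256(original).hexdigest()   # published by the sender

good = file_matches(original, advertised)
bad = file_matches(original + b" plus malware", advertised)
print(good, bad)
```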

What does Peersm know?

Nothing. The message flow of Peersm is described by the image below (note: hash information in RELAY_DB_INFO has been removed since the ORDBs do not have to know it):

Inside the ORDB, the data are not in the clear but encrypted with the key that comes with the hash_name. The ORDB could store what it is relaying (which it does not do), but this would require a lot of storage and would in the end be of no use (for law enforcement or otherwise), since it does not know who owns the files or who downloaded them, and it cannot trivially determine what a monitored file is. In addition, you can re-encrypt the files so that the ORDB has no way of knowing what the files are about.

So a potential law enforcement action on our servers would not compromise anybody since the ORDBs do not know anything and do not record anything.

Resuming a download, bandwidth and dimensioning

Since the ORDB relays the traffic, the number of users it can handle depends on its bandwidth. For now there is one ORDB, but the specifications allow several ORDBs to share the traffic.

A peer may close his browser during a download, or an anonymized connection may break. Peersm can resume a download (manually or automatically) from any state, whether it is a direct download or a peer-to-peer one. If something unexpected happens you cannot lose data, since every part of the file that has been received is stored in real time.
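
The resume behaviour can be illustrated with a small simulation. The `ChunkStore` class stands in for browser storage and the `download` function for a transfer that may be interrupted; both, as well as the chunk size, are assumptions for the example rather than Peersm's implementation.

```python
class ChunkStore:
    """Toy stand-in for browser storage: keeps every received chunk by offset."""

    def __init__(self, total_size):
        self.total_size = total_size
        self.chunks = {}  # offset -> bytes

    def write(self, offset, data):
        self.chunks[offset] = data  # stored as soon as it is received

    def next_offset(self):
        # First byte that is not yet stored contiguously from the start.
        offset = 0
        while offset in self.chunks:
            offset += len(self.chunks[offset])
        return offset


def download(source, store, chunk_size, fail_after=None):
    """Fetch from where the store left off; optionally simulate a broken circuit."""
    received = 0
    offset = store.next_offset()
    while offset < store.total_size:
        if fail_after is not None and received == fail_after:
            return False  # connection dropped, stored chunks are kept
        chunk = source[offset:offset + chunk_size]
        store.write(offset, chunk)
        offset += len(chunk)
        received += 1
    return True


source = bytes(range(256)) * 4                         # 1024 bytes of sample data
store = ChunkStore(len(source))
first_try = download(source, store, 100, fail_after=3)  # breaks after 3 chunks
resume_from = store.next_offset()
second_try = download(source, store, 100)               # resumes, no refetch
rebuilt = b"".join(store.chunks[o] for o in sorted(store.chunks))
print(first_try, resume_from, second_try, rebuilt == source)
```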

Only "seeders" (those that have a complete file) can advertise a complete file they have.

Peersm permanently monitors the different anonymized circuits, destroys the ones that do not perform well and restores the ones that were broken.
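
One pass of such a monitoring loop could be sketched as below. The `Circuit` fields, the throughput threshold and the `build_circuit` helper are hypothetical; the source does not specify how performance is measured.

```python
from dataclasses import dataclass
from itertools import count
from typing import Callable, List


@dataclass
class Circuit:
    circuit_id: int
    alive: bool
    throughput_kbps: float


def maintain(circuits: List[Circuit], min_kbps: float,
             build: Callable[[], Circuit]) -> List[Circuit]:
    """Keep healthy circuits; replace broken or slow ones with new ones."""
    result = []
    for circuit in circuits:
        if circuit.alive and circuit.throughput_kbps >= min_kbps:
            result.append(circuit)
        else:
            result.append(build())  # destroy and re-establish
    return result


ids = count(100)


def build_circuit() -> Circuit:
    # Hypothetical: a real client would negotiate a fresh circuit here.
    return Circuit(next(ids), alive=True, throughput_kbps=500.0)


pool = [Circuit(1, True, 800.0), Circuit(2, True, 20.0), Circuit(3, False, 0.0)]
pool = maintain(pool, min_kbps=100.0, build=build_circuit)
print([c.circuit_id for c in pool])
```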

Which anonymizer network?

The Tor network. The Tor protocol is known to be robust, assuming the routers in the path are correctly chosen. Peersm does not operate any nodes other than the ORDBs.

In case the Tor network is not enough, node-Tor allows independent nodes to be inserted into the network: normal ones supporting WebSockets, and ORDBs. A package could be released so that people can scale the nodes.

Why can you trust Peersm?

Peersm is based on the concept that you can only trust yourself. As explained in the following paragraph, any strange thing that Peersm could do is quite easy to detect.

You can trust browsers (assuming you have deactivated some default tracking options) because they are so widely used that any defect, security leak or suspicious behavior is immediately detected by the community, reported publicly and corrected by browser vendors. They cannot afford to ignore it. The same goes for JavaScript. In addition, JavaScript code cannot be hidden, so what Peersm is doing is more transparent than something that comes with an installation package, even from open source projects. The fewer outside modules you use, the better, as this prevents an attacker from infecting the outside modules (Windows for example) and then compromising you when you think you have done your best to secure your application. This is the case with JavaScript here.

Detractors who say that JavaScript is insecure are simply misinformed and unaware of common knowledge and best practices.

And, again, the ORDBs do not know whom they are talking to or what they are relaying.

Content Delivery Networks (CDNs)

It is planned for Peersm to be compatible with popular specific CDNs such as MEGA.

How can you help scale Peersm?

We take care of the ORDB routers, but we cannot handle the WebSocket access nodes, because this would break the rule explained above: we would be controlling both the entry node and the relay one.

So, you can run a Tor router and compile the websocket-server module.

Then you add in the torrc file:

ExtORPort 8001
ServerTransportPlugin websocket exec /usr/local/bin/websocket-server --port 80

It's advised to use port 80 for WebSockets so that the routers can be reached from any network (university, etc.).

Then give us the information about your router and we will include it in the access nodes list.

Technical details

For the complete specifications and technical details, please look at node-Tor/Peersm on git

Case study: could WebRTC be used with Peersm?

This is a plausible Peersm architecture using WebRTC. The advantage would be to use the ORDB only for signaling, not for relaying the data.

WebRTC has two modes: a true peer-to-peer mode, where UDP data are exchanged without going through a server, and a relay mode, where a server relays the data as Peersm does. The reason is that peer to peer is sometimes impossible because you cannot be reached from the outside. To keep it simple, the NAT in the drawing is your ADSL box or your company network, and sometimes it does not allow outside communications to reach you.

The case study here is about the peer-to-peer mode. Let's say that we can traverse your NAT: could your privacy and anonymity be ensured with WebRTC?

The first issue is that the STUN servers know the public IPs of Alice and Bob.

You can see in the drawing that Bob and Alice encrypt the information needed to reach them, so the ORDB does not know it. This implies that Bob has negotiated a secret with Alice, which is the second issue: there is no simple way to do this that does not involve Alice and Bob knowing each other, or our managing private/public keys for them, and if we did that we would know who owns which key. Alice and Bob could generate keys dynamically and use them to share a secret, but the ORDB could easily act as a man in the middle and intercept the communication.

Alice knows Bob's public IP and vice-versa.

Alice and Bob are exchanging data encrypted with the key that they have negotiated.

But as stated before, we don't know a simple way to negotiate this key.
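
The man-in-the-middle risk can be illustrated with a toy unauthenticated Diffie-Hellman exchange. This is a sketch under stated assumptions: the modulus and generator are chosen for readability and are not suitable for real use, and the "relay" stands for any signaling server, such as an ORDB in this architecture, that forwards the public values.

```python
import secrets

P = 2**127 - 1   # a Mersenne prime; far too small and unvetted for real use
G = 3


def keypair():
    private = secrets.randbelow(P - 2) + 1
    return private, pow(G, private, P)


def shared(private, other_public):
    return pow(other_public, private, P)


# Honest exchange: both sides derive the same secret.
a_priv, a_pub = keypair()
b_priv, b_pub = keypair()
honest = shared(a_priv, b_pub) == shared(b_priv, a_pub)

# A relay that forwards the public values can substitute its own instead.
m_priv, m_pub = keypair()
alice_key = shared(a_priv, m_pub)         # Alice believes m_pub is Bob's value
bob_key = shared(b_priv, m_pub)           # Bob believes m_pub is Alice's value
relay_with_alice = shared(m_priv, a_pub)
relay_with_bob = shared(m_priv, b_pub)

# The relay now shares one key with each side and can read everything.
print(honest, alice_key == relay_with_alice, bob_key == relay_with_bob)
```

Without some way to authenticate the public values, which brings back the key management problem described above, neither Alice nor Bob can tell the substituted value from the genuine one.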

To summarize: the STUN servers know the IP addresses of Alice and Bob, Alice and Bob know each other's IP addresses, and Alice and Bob cannot negotiate an encryption key that would let them hide their identity from the ORDB and encrypt the data they exchange outside of the anonymizer network.

So unless someone has a better architecture proposal, the conclusion is obvious: you can not use WebRTC if you want to protect your privacy and anonymity.


Related projects

Distributing and accessing data inside browsers is of course not an idea unique to us. Imagine how powerful this is given the number of browsers in the world (1.5 billion). Some other projects use this concept too.
