Kristov Atlas Security Advisory < 20140609-0 >
Title: Weak Privacy Guarantees for SharedCoin Mixing Service
Product: Blockchain.info SharedCoin mixing service
Privacy Impact: High (7.0 of 10.0)
Research By: Kristov Atlas
Version of this Document: 2
The SharedCoin mixing service provided by Blockchain.info offers only limited privacy to users due to weaknesses in its design. Bitcoin users should carefully consider their privacy requirements and evaluate other mixing services if they require serious privacy guarantees. A tool for analyzing SharedCoin and other CoinJoin-based mixing protocols will be released approximately two weeks following this advisory to allow SharedCoin users adequate time to protect their privacy.
Vendor & Product Description:
Blockchain.info provides Bitcoin users a free wallet service, available as a web wallet and Android app. “Blockchain.info is Bitcoin’s most popular bitcoin wallet and block explorer. As of January 2014 the site has over 1.1 million registered users and 200 million page views per month.” Users can elect either to send bitcoins through standard Bitcoin transactions, or more privately using Blockchain.info’s SharedCoin service. SharedCoin is modeled after the CoinJoin privacy protocol designed to mix bitcoins from multiple users. When mixing their coins, customers select the number of times to repeat the SharedCoin process, between two and ten. Blockchain.info describes their service as follows: “Shared Coin is a method of making transactions which requires less trust in the service. Shared Coin is based on the CoinJoin concept which acts as a meeting point for multiple people to join together in a single transaction. Having multiple people in a transaction improves privacy by making transactions more difficult to analyze. The important distinction between traditional mixing services is the server cannot confiscate or steal your coins.”
SharedCoin is purportedly an open source project, with a source code repository available at:
Blockchain.info is owned by Blockchain Luxembourg S.A.R.L. The CEO of Blockchain.info is Nicolas Cary.
In a sample of 20,000 consecutive transactions across 45 blocks in the Bitcoin blockchain, 2.6% of the transactions (529) fit the profile of SharedCoin transactions†. This small sample constitutes only 7 hours of Bitcoin transactions from March 27, 2014.
†: Three criteria were used to identify a given transaction as a potential SharedCoin transaction: The transaction contained more than four inputs, the transaction contained more than four outputs, and the transaction was relayed internally by Blockchain.info.
Pending further research into this area, the SharedCoin service should be used only as a light protective measure for financial privacy. SharedCoin will be suitable for protection against unskilled examiners of the Bitcoin blockchain until user-friendly analysis tools are released to the public. Users who require the aid of mixing services to protect themselves from agents with intermediate programming skills should carefully evaluate other services and technologies to meet their financial privacy requirements.
It is currently unclear whether adding additional rounds of SharedCoin will positively or negatively affect privacy. Until changes are made to SharedCoin to address the weaknesses identified in this advisory, users who continue to utilize SharedCoin are most likely better off using the maximum number of rounds permitted (10) in order to increase the amount of computation required by adversaries to analyze SharedCoin activity across multiple transactions.
Bitcoin is a decentralized currency based on a data structure called a “blockchain.” The Bitcoin blockchain serves as a public ledger of all transactions in the network, allowing participants in the network to come to a global consensus about the balance of Bitcoin addresses and the fidelity of Bitcoin transactions submitted to the network by users. Users on the network adopt pseudonyms in the form of Bitcoin addresses, typically represented as strings of letters and numbers. Despite the adoption of pseudonyms, the openness of the blockchain permits both targeted and broad analysis of Bitcoin users. The original design of the protocol encourages users to avoid privacy pitfalls by generating new addresses for each new transaction, and by avoiding the pooling of funds from multiple addresses whenever possible. This is generally difficult for many users for a variety of reasons including lack of education, lack of Bitcoin client features, and poor client interface design.
Bitcoin developers and entrepreneurs have proposed a number of improvements at the client level to help mitigate some of these privacy risks. Early on in the Bitcoin ecosystem, entrepreneurs launched third-party mixing services that are capable of breaking down the path of bitcoins on the blockchain through off-chain accounting. However, users of these third-party mixing services must trust the services not to steal customer funds, and to protect those funds from theft by others. To address the weaknesses of third-party mixing services, Bitcoin developers proposed a set of client-side augmentations that would permit peer-to-peer mixing. Developer Gregory Maxwell formalized one of those proposals as a client protocol in a post to the BitcoinTalk forum on August 22, 2013 and named it CoinJoin. CoinJoin takes advantage of the fact that a single Bitcoin transaction can draw funds from many inputs and pay them to many outputs. It combines the funds of multiple Bitcoin users together in a single transaction without the need for any third party service to act as a temporary custodian. CoinJoin users synchronously take turns signing a CoinJoin transaction until all involved parties agree on the final state of the transaction, before finally committing it to the blockchain. The end result is that the users are able to mix their funds together without any opportunity for theft to occur. Some party must act as conductor to orchestrate this process between CoinJoin users.
A Bitcoin client named SX was one of the first clients to implement the CoinJoin protocol. SX took a simple approach to implementing CoinJoin, allowing users to gather in a chat room, make use of a CoinJoin server to orchestrate the protocol, and mix their coins. Users could only use this CoinJoin feature when each of them was willing to contribute and receive the exact same amount of funds. Here is an example of one of SX’s CoinJoin transactions from the blockchain:
Transaction hash: 9256332b9ca52cbcb06f57296dfd982d8da3f7d4696b4c10cf9bb93dae6edf58
Date: 2013-08-27 06:55:31
Input #1: 0.0101 BTC from address 1MzpcbFKrR4g9wyu1wBKGApyktsf4wteDe
Input #2: 0.0101 BTC from address 18oXQcw6gyGbTq3VrKAJQN2zZ2giTbrUu8
Input #3: 0.0101 BTC from address 17PU5P2Qy3bEyN9PBwEBNbfA6jNmbDhTYd
Output #1: 0.01 BTC to address 1Fufjpf9RM2aQsGedhSpbSCGRHrmLMJ7yY
Output #2: 0.01 BTC to address 1Evy47MqD82HGx6n1KHkHwBgCwbsbQQT8m
Output #3: 0.01 BTC to address 18f5VuUgFspeHL5d7Lh8xboLLu7dCgiDR9
In this transaction, three different users (or one user using three instances of the SX client, perhaps) provided 0.0101 BTC as input to the transaction. Three addresses received 0.01 BTC at the end of the transaction. The remaining 0.0003 BTC was spent as a miner’s fee to have the transaction quickly confirmed by the network. Because all of the input amounts are identical (0.0101 BTC) and all of the output amounts are identical (0.01 BTC), this is a perfect CoinJoin specimen in that the amounts leak no information about which of the 0.01 BTC outputs correspond to which of the 0.0101 BTC inputs. The chance of co-ownership of each of them is one in three.
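The arithmetic in this example can be checked directly. The following is a small sketch using only the amounts listed above; the variable names are illustrative and not taken from SX.

```python
from decimal import Decimal
from fractions import Fraction

# Amounts in BTC, copied from the SX transaction listed above.
inputs = [Decimal("0.0101")] * 3
outputs = [Decimal("0.01")] * 3

# Whatever the inputs supply beyond the outputs goes to the miner as a fee.
miner_fee = sum(inputs) - sum(outputs)

# All inputs are identical and all outputs are identical, so the amounts
# alone give no hint about which input funded which output.
amounts_leak_nothing = len(set(inputs)) == 1 and len(set(outputs)) == 1

# Judging by amounts alone, each input is an equally likely source of a given output.
chance_of_link = Fraction(1, len(inputs))

print(miner_fee, amounts_leak_nothing, chance_of_link)
```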
This approach is significantly restricted in that all participants must be willing to input and receive identical amounts of funds. Users who wanted privacy from other CoinJoin users could be left waiting a long time for partners seeking to send identical amounts of money. Since this approach does not permit change addresses, all of the participants must in fact have equivalent balances in their input addresses before participating in the CoinJoin operation.
Many wallet services have been eager to incorporate CoinJoin into their services to improve users’ privacy. However, copying SX’s approach would have proved impractical for web-based or mobile wallet services. Requiring aspiring CoinJoin participants to wait a long time for partners would prove both confusing and frustrating for many users in that demographic. In consideration of these limitations, Blockchain.info programmed and released their own take on CoinJoin, branded “SharedCoin”, as part of their wallet service. It’s unclear whether this rebranding was done for the purpose of promotion, or to distance themselves from would-be critics who might point out critical differences between SharedCoin and the original CoinJoin proposal. SharedCoin was released some time between September and October 2013, having been posted to the popular Bitcoin sub-reddit by user “HostFat” on October 19, 2013. Source code for SharedCoin was first posted publicly to its GitHub repository on December 22, 2013. Following Coin Validation’s plans to intentionally subvert Bitcoin privacy en masse, Blockchain.info announced on November 17, 2013 that SharedCoin would be made free for use.
Assuming that a web wallet service does not want users to wait a long time for partners, they must make some adjustments from the SX baseline. Gregory Maxwell’s CoinJoin proposal was actually more relaxed than SX’s approach: “N users would agree on a uniform output size and provide inputs amounting to at least that size. The transaction would have N outputs of that size and potentially N more change outputs if some of the users provided input in excess of the target.” Still, this would mean that users could only CoinJoin together if they were interested in receiving the same output amounts as each other participant. Unless a wallet service had a large volume of users and transactions to match up, this could lead to long wait times.
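The output structure in Maxwell's description can be sketched as follows. This is a minimal illustration, not code from any CoinJoin implementation: the function name `build_outputs` is hypothetical, each user is assumed to contribute a single total amount, and mining fees are ignored.

```python
from decimal import Decimal

def build_outputs(contributions, target):
    """Return (uniform_outputs, change_outputs) for N users.

    `contributions` holds one total input amount per user (an assumption
    made for illustration); `target` is the agreed uniform output size.
    """
    if any(amount < target for amount in contributions):
        raise ValueError("every user must provide at least the target amount")
    # One output of the agreed size per user.
    uniform_outputs = [target for _ in contributions]
    # A change output only for users who provided more than the target.
    change_outputs = [amount - target for amount in contributions if amount > target]
    return uniform_outputs, change_outputs

uniform, change = build_outputs(
    [Decimal("1.0"), Decimal("1.5"), Decimal("2.25")], Decimal("1.0")
)
```

In this example the three uniform outputs are indistinguishable by amount, but the two change outputs have distinct values, which is the kind of information an analyst can work with.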
In order to study the behavior of SharedCoin, I performed a number of SharedCoin transactions using my own bitcoins and observed their path through the series of SharedCoin transactions. Out of legal risk considerations, I have decided not to include the details of those transactions in this advisory, but I can compare the results to other transactions I identified within the blockchain that were consistent in nature with the SharedCoin transactions that I participated in.
The SharedCoin transactions I participated in had the following common characteristics:
- They were relayed internally by a Blockchain.info IP address.
- They contained 9 or more transaction inputs.
- They included 9 or more transaction outputs.
- The number of inputs and outputs was often different.
- They always include a mining fee, and this fee is always a multiple of some constant (currently 0.0001 BTC).
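The characteristics listed above could be turned into a rough filter along these lines. This is a sketch under stated assumptions, not the method used to select transactions for this advisory: amounts are taken to be integers in satoshis, and the `relayed_by_blockchain_info` flag is a hypothetical input that would have to come from some external source of relay data.

```python
FEE_UNIT = 10_000  # 0.0001 BTC expressed in satoshis

def fee_satoshis(input_values, output_values):
    """Mining fee: what the inputs supply beyond what the outputs receive."""
    return sum(input_values) - sum(output_values)

def matches_observed_profile(input_values, output_values, relayed_by_blockchain_info):
    """Check a transaction against the characteristics listed above."""
    fee = fee_satoshis(input_values, output_values)
    return (
        relayed_by_blockchain_info
        and len(input_values) >= 9
        and len(output_values) >= 9
        and fee > 0
        and fee % FEE_UNIT == 0  # fee is a multiple of the constant
    )
```

A match indicates only that a transaction is consistent with the observed profile; it does not prove that SharedCoin produced it.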
Some SharedCoin transactions have drastically different numbers of inputs and outputs, suggesting that either a large number of change addresses may be included as outputs, or that SharedCoin attempts to obscure the path of customer funds by splitting and joining them into a random number of addresses, e.g. an input of 1 BTC could be split into two outputs of 0.3 BTC and 0.7 BTC for a single round of SharedCoin.
Bitcoin privacy analysis is difficult because there are currently few publicly available tools to analyze the blockchain. The average user may incorrectly assume that SharedCoin privacy can be measured by the number of inputs and outputs, or in terms of the classic taint analysis provided by Blockchain.info. To test the validity of these intuitions, I created a tool that examines all possible combinations of inputs and outputs to determine if a relationship can be deduced in spite of SharedCoin’s attempts to prevent this. This tool is named CoinJoin Sudoku.
To illustrate the success of the tool to date, let us consider a sample SharedCoin transaction. This transaction was selected from a sample of 20,000 Bitcoin transactions as a likely SharedCoin transaction based on its profile.
Transaction hash: 0e0337bdf930eba3b082fdfbd30944b18e03f0f810ae531443161f897a4d3db0
Date: 2014-03-27 16:51:48
CoinJoin Sudoku tries to identify individual participants in CoinJoin transactions by searching for common ownership of inputs and outputs. The tool considers all of the possible ways to group inputs and outputs, and eliminates the possibilities that include groups that do not add up between inputs and outputs, since they do not demonstrate common ownership. For example, a grouping that includes 2 BTC and 3 BTC as inputs and 1 BTC and 4 BTC as outputs would be a valid grouping since both add up to 5 BTC; inputs of 2 BTC and 3 BTC and outputs of 1 BTC and 2 BTC would not be a valid grouping, since they do not add up to the same number.
Figures 1 & 2: Valid and invalid groupings of inputs/outputs when analyzing SharedCoin
For the sake of speed efficiency, the tool currently processes a transaction by examining one digit at a time in the inputs and outputs, working its way from right to left; this is faster because transactions typically involve inputs and outputs with many zeros, which can be ignored while processing a given digit.
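One possible reading of this digit-by-digit strategy is sketched below. It is an assumption about how such a filter might work, not a description of the tool's internals: amounts are integers in satoshis, and the function names are hypothetical. The idea is that two groups whose totals are equal must also agree in their lowest digits, so a candidate grouping can be discarded cheaply, and any amount whose low digits are all zero can be skipped for that check.

```python
def relevant_values(values, num_digits):
    """Keep only amounts that are non-zero in their lowest `num_digits` digits."""
    modulus = 10 ** num_digits
    return [v for v in values if v % modulus != 0]

def low_digits_match(input_group, output_group, num_digits):
    """Necessary (not sufficient) test: do the totals agree in the low digits?"""
    modulus = 10 ** num_digits
    in_low = sum(relevant_values(input_group, num_digits)) % modulus
    out_low = sum(relevant_values(output_group, num_digits)) % modulus
    return in_low == out_low
```

A grouping that fails this check at any digit count cannot balance, while one that passes still needs the full comparison of totals.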
In order to complete the processing of this transaction in a reasonable period of time using only one processor, I instructed the tool to skip processing the 3rd digit in each number (e.g. the “2” in “0.026939 BTC” for the first input). The tool is currently so inefficient that it took 30.75 hours to complete the processing on a single 2.3 GHz processor core. Removing the restriction would allow the tool to much more thoroughly de-anonymize the transaction, but would require substantially longer without further efficiency improvements. Despite the limitation, the tool was able to group 69% of inputs and 53% of the transaction’s outputs.
Figures 3 & 4: Relationship analysis of inputs and outputs in the transaction, before and after CoinJoin Sudoku is run.
A colored input or output in the diagram above indicates that the tool was able to establish common ownership of that Bitcoin address with 100% certainty. In this case, two SharedCoin participants were identified, denoted by red and blue. Common ownership could only be established with less than 100% certainty for the addresses with no color assigned above. The two participants identified represent a ceiling for the number of participants who own those addresses, since it could in fact be just one participant responsible for all of those addresses. We can understand this with a simple analogy: if you see two bicycles parked next to each other, you know they’ll have at most two owners, but it could be just one person who owns both bicycles.
The results of the tool clearly indicate that the anonymity set for participants is much smaller than the number of inputs and outputs. The results also indicate that taint % analysis is a poor measure of the privacy of CoinJoin transactions. For example, consider the first two inputs of the example SharedCoin transaction for addresses “19pe…” and “1zXV…”, and the output for address “1F3y…”. Viewing the taint analysis for the output “1F3y…” provided by Blockchain.info, the output is tainted 4.2% by input “19pe…” and 4.6% by input “1zXV…”, but CoinJoin Sudoku reveals that input “19pe…” is related to the output with 100% certainty, and input “1zXV…” has only a 50% likelihood of a relationship with the output (see Fig. 5).
Figure 5: Likelihood of relationship between output and two inputs:
In approximately two weeks, CoinJoin Sudoku will be released, open-source. The delay is intended to allow former users of SharedCoin adequate time to take extra steps to protect their financial privacy. The code will be made available here:
CoinJoin Sudoku should be considered an early step in researching the efficacy of CoinJoin-based privacy services such as SharedCoin. There is room for extensive optimization of the tool to improve its running time, including rewriting the code in a speedier, compiled language (it is currently written in PHP, the official language of Mt Gox), applying heuristic algorithms to improve on brute force guessing, storing the results of previous calculations in a database in order to avoid repeating work, modifying the tool to be asynchronously executed on multiple processors, and adapting the search algorithms to explicitly use matrix multiplication that can be executed much more efficiently by GPUs.
Aside from efficiency improvements, this tool can be built upon to analyze across multiple transactions. This would give us a much clearer picture of whether adding additional rounds of SharedCoin improves the privacy of users, and may also make the tool more useful to analyze other CoinJoin-based privacy technologies, such as Dark Wallet and Darkcoin’s DarkSend.
Vendor contact timeline:
2014-05-26: Contacted vendor’s customer support personnel, delivered summary of findings and expected disclosure timeline.
2014-05-26: Vendor confirmed that summary of findings has been forwarded to security personnel.
2014-06-09: Advisory released.
2014-06-23: (Anticipated) Release of proof-of-concept tool, CoinJoin Sudoku.
2015-01-08: The vendor claimed to have fixed this issue (not yet confirmed) in this GitHub comment.
At the time of releasing this advisory, the vendor has not released any details about planned improvements to the SharedCoin service. In January 2015, the company contacted me to claim that they have fixed the issues identified in this advisory, though this has not yet been confirmed. Because the blockchain is a permanent and public record of all transactions, weak privacy obtained from previous uses of the SharedCoin service (or any mixing service) cannot be undone. It is possible for Blockchain.info to improve SharedCoin for future transactions; this may require tighter restrictions on which customers Blockchain.info permits to participate in the same SharedCoin transactions.
Author Contact Information:
Kristov Atlas, Security & Privacy Researcher
Sample taint analysis for a random Bitcoin address: https://blockchain.info/taint/1548xyqG8SFFRtFeRzMYCFHY1UnpaT8pdG
K.A. / (c) 2014