The post Bitcoin – ECDSA signature appeared first on Delfr.
Simply stated, a bitcoin transaction is a transfer of spending control between different parties over a pre-specified amount of satoshis. A satoshi is the smallest fraction of a bitcoin and is equivalent to 10⁻⁸ BTC. In order to successfully complete said transfer, the sender must demonstrate that she is the rightful owner of the satoshis she wishes to spend. Such a proof is imperative as it allows the different nodes on the network to reach an agreement regarding the validity of the transaction and, as a result, facilitate its inclusion in the blockchain.
At the time of writing, bitcoin’s proof of ownership is encapsulated in a particular type of digital signature known as the Elliptic Curve Digital Signature Algorithm (ECDSA). It is a variant of the Digital Signature Algorithm (DSA) that relies on Elliptic Curve Cryptography (ECC).
In the first section, we introduce the DSA scheme, prove its correctness, and discuss some of its security properties. In particular, we point out that as of the time of writing, and despite its prevalence in various cryptographic settings, we do not know of any valid security proof of DSA in the random oracle (RO) model. However, we highlight that slight variations of it can be proven to be secure.
In the second section, we introduce the ECDSA scheme and prove its correctness. Later on, we present a python-based implementation to further elucidate its building blocks. We also describe how an ECDSA signature gets typically encoded within a bitcoin transaction. Finally, we highlight some of the scheme’s potential shortcomings including the absence to-date of a security proof in the RO model, its susceptibility to being malleable, and its non-linear design that hinders an efficient implementation of multisignature transactions.
In order to better understand the material contained herein, we recommend that the reader familiarize himself with the necessary prerequisites fleshed out in the following three posts:
The invention of the Digital Signature Algorithm (DSA) is attributed to David W. Kravitz [9] who used to work for the National Security Agency (NSA). The legitimacy of this invention has been contested by Claus Schnorr (the inventor of the Schnorr signature scheme), who asserted that DSA is covered in another patent of his [12]. Readers interested in the claims and counterclaims surrounding the origin of DSA can refer to e.g., [2].
The DSA scheme is built on the finite fields 𝔽_p and 𝔽_q, where p and q are two large prime numbers of respective bit-lengths |p| and |q|, and such that q is a divisor of p − 1. We can think of |q| as the scheme’s security parameter. Let g be an element of order q in the multiplicative cyclic group 𝔽_p^*. One way of finding such a g consists in letting g ≡ h^((p−1)/q) (mod p) for an arbitrary h ∈ 𝔽_p^* such that h^((p−1)/q) ≢ 1 (mod p). To see why this construction works, note that g^q ≡ h^(p−1) ≡ 1 (mod p) by Fermat’s little theorem; since q is prime and g ≢ 1 (mod p), the order of g is exactly q.
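As a quick illustration, the following sketch builds such a generator with deliberately tiny, insecure toy primes (p = 23, q = 11, so that q divides p − 1): given an arbitrary h, the candidate g = h^((p−1)/q) mod p generates the order-q subgroup whenever it differs from 1.

```python
# Toy DSA subgroup-generator construction (tiny, insecure
# parameters chosen purely for illustration).
p, q = 23, 11                 # primes with q dividing p - 1 = 22
h = 2                         # arbitrary element with 1 < h < p - 1
g = pow(h, (p - 1) // q, p)   # candidate generator of the order-q subgroup

# Fermat's little theorem gives g^q = h^(p-1) = 1 (mod p); since q is
# prime, g != 1 guarantees that g has order exactly q.
assert g != 1
assert pow(g, q, p) == 1
print(g)  # → 4
```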
Similarly to other signature schemes, we define DSA as a set of three algorithms:
Let trunc denote the truncation function mapping bit-strings of arbitrary length to strings of length at most |q|, obtained by keeping the leftmost bits.
proceeds as follows:
The signing algorithm finally outputs a signature (r, s); it is modeled as a PPT Turing machine.
The verification algorithm is deterministic as opposed to probabilistic.
Correctness of DSA. The DSA scheme satisfies the correctness property. In other words, any signature generated by the signing algorithm will cause the verification algorithm to output True. To see why, let (r, s) be an appropriate signature on message m. First note the following chain of implications:
Recalling that order and noting that for some appropriate integer we have:
we get:
Similarly, for some appropriate integer we can write:
Using one more time the fact that order we get:
Upon verification, algorithm computes and concludes that based on the previous equality. This result shows that the output of satisfies the verification algorithm hence demonstrating the correctness of DSA.
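In standard DSA notation — private key x, public key y = g^x mod p, nonce k, r = (g^k mod p) mod q, and s = k⁻¹(z + xr) mod q with z the truncated message hash — this chain of equalities can be summarized as:

```latex
\begin{aligned}
u_1 &\equiv z s^{-1} \pmod{q}, \qquad u_2 \equiv r s^{-1} \pmod{q},\\[2pt]
g^{u_1} y^{u_2} &\equiv g^{z s^{-1}}\, g^{x r s^{-1}} \equiv g^{(z + x r)\, s^{-1}} \equiv g^{k} \pmod{p},\\[2pt]
\bigl(g^{u_1} y^{u_2} \bmod p\bigr) \bmod q &= \bigl(g^{k} \bmod p\bigr) \bmod q = r.
\end{aligned}
```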
Security of DSA: The importance of the randomness of the parameter k. A necessary condition for the DSA scheme to be secure is for the parameter k to be used only once per signature instance. Indeed, if this were not the case, one would be able to derive the private key of the signer. To see why, suppose that (r, s₁) and (r, s₂) are two signatures generated by the same signer with private key x, and such that both reuse the same nonce k. We write:
By design of the signing algorithm we have sᵢ ≡ k⁻¹(zᵢ + x·r) (mod q) for i ∈ {1, 2}, where zᵢ denotes the truncated hash of message mᵢ.
We then get: s₁ − s₂ ≡ k⁻¹(z₁ − z₂) (mod q).
This allows us to solve for the parameter k as follows: k ≡ (z₁ − z₂)·(s₁ − s₂)⁻¹ (mod q).
Finally, note that the design of the signing algorithm mandates that r ≢ 0 (mod q). This implies that r is invertible modulo q. Consequently, one can retrieve the private key x ≡ (sᵢ·k − zᵢ)·r⁻¹ (mod q) by using either signature s₁ or s₂, the hash of the corresponding message z₁ or z₂, and the common value k.
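The recovery above can be checked end-to-end with a short sketch. The parameters below are hypothetical toy values (tiny and insecure, for illustration only): we produce two DSA-style signatures sharing the same nonce k and recover the private key x exactly as in the derivation.

```python
q = 101                      # small prime standing in for the subgroup order

def inv(a, m):
    return pow(a, -1, m)     # modular inverse (Python 3.8+)

x, k = 37, 59                # private key and the (reused) nonce
r = 66                       # r is identical for both signatures since k is reused
z1, z2 = 11, 83              # truncated hashes of two distinct messages
s1 = (inv(k, q) * (z1 + x * r)) % q
s2 = (inv(k, q) * (z2 + x * r)) % q

# The attacker's computation:
k_rec = ((z1 - z2) * inv(s1 - s2, q)) % q
x_rec = ((s1 * k_rec - z1) * inv(r, q)) % q
assert k_rec == k and x_rec == x
```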
Security of DSA: A note on existential unforgeability. A security proof for a digital signature scheme is essentially a proof of resilience against existential forgeries in the adaptive chosen-message setting (EFACM). The rather odd observation is that despite the widespread adoption of DSA, there is no known proof of its security in the RO model. Typically, such proofs rely on a reduction technique that transforms a hypothetically successful forgery attack into a solution to a computational problem believed to be hard (e.g., finding discrete logarithms over certain finite groups).
Before proceeding further, we recommend that the reader familiarize himself with the content of the following two posts for a better understanding of the logic outlined in this section:
The belief that a DSA security proof may be difficult to construct rests on our inability to date to successfully leverage the reduction model (RM) in that respect. However, it is important to note that the absence of a proof at the time of writing does not imply that a proof does not exist. In what follows, we attempt to argue why a DSA security proof based on RM may be difficult to devise. To do so, we will need to revisit the foundational steps of the model as outlined in the aforementioned posts.
Recall that RM applies a reductio ad absurdum argument that starts by assuming that the signature scheme is not secure i.e., there exists a PPT adversary such that:
where is ‘s random tape, is the random tape of DSA’s signing algorithm (not to be confused with the that appears in ECDSA’s signature), is the random oracle, is DSA’s security parameter and a quantity non-negligible in
Subsequently, the model applies a series of steps that culminate in the extraction of a solution to a problem thought to be computationally hard e.g., finding the private key associated with a given public key . In DSA, one way of solving for the key consists in devising two distinct valid signatures and leading to a linear equation in Conditions C and C below are jointly sufficient for this to be possible:
C
C
To see why, substitute with and write C as Since order this implies that C would then allow us to solve for
In what follows, we derive necessary conditions for C and C to hold. We then argue why applying RM to the DSA scheme does not guarantee that these necessary conditions actually hold. This means that we cannot conclude with certainty that C and C hold, leading us to conclude that solving for the private key may be difficult after all. We reiterate that we are not arguing that a security proof for DSA is not possible, but rather that such a proof may be difficult to achieve using the reduction technique.
First, since both signatures are valid, the verification equations guarantee that for :
As a result:
C
Consequently, and must exhibit a certain relationship for the first condition to hold. With overwhelming probability, two randomly chosen parameters and will not satisfy this equality.
Since C and since for we get:
C C
We also know that for This yields:
The takeaway is that to be able to effectively use the reduction technique to solve for in the case of DSA, one will possibly need to ensure that valid signatures and satisfy the following at a minimum:
We deliberately used the term “possibly” and the justification is two-fold:
In what follows, we apply the remaining steps of RM and for argument’s sake, we assume that steps two and three hold. In other words, we assume that we are able to:
In the fourth step, one would show that an adversary that can forge a signature is also capable, with non-negligible probability, of creating a second forgery distinct from the first one. Most importantly, the two forgery attacks would behave in a similar way up to a certain well-defined event. Formally, one would show that the following quantity is non-negligible in :
When producing its first forgery , the adversary is assumed to make a number of queries to random oracle We denote the query as and its corresponding random oracle reply as (i.e., The second forgery is created through a process known as an “oracle replay attack” and consists of:
The standard forking lemma applied in RM subsequently leads to the following result: Given a successful forgery tuple we can find with non-negligible probability another successful forgery tuple such that:
, but
We let correspond to , and correspond to . At this stage, we highlight two important observations:
Since could be any value in and since order is a prime number equal to the probability of such an event is on the order of which is negligible in .
In light of the above observations, if we let be the index of the query corresponding to we can ensure that while This will satisfy part of the necessary conditions that we discussed earlier for C C to hold. The issue however, is with enforcing the remaining part of the necessary conditions, namely that and be appropriately linked. The reason this is difficult to enforce is because there is no guarantee that gets selected by before is queried to could very well be specified after the query causing the two signatures to have associated parameters and without any particular binding relationship.
Due to this limitation, one cannot conclude with certainty that DSA is necessarily secure in the RO model. However, slight variants of the DSA scheme can be shown to be secure. We mention below two such variants due to Brickell [5] and to Pointcheval and Vaudenay [11]:
Aside from these variants, Fersch et al. [8] devised a security proof for the unmodified version of DSA by introducing an extra modeling constraint. This constraint, also known as the bijective random oracle, applies to the conversion function:
The conversion function is none other than the one that DSA’s signing algorithm uses to calculate r with a group element as input. The constraint that Fersch et al. impose consists of representing it as a composition of three functions, the middle of which is a bijection, with the other two modeled as random oracles. We will not go over the details of their proof, but the interested reader can refer to [8].
The Elliptic Curve Digital Signature Algorithm (ECDSA) is a variant of DSA that uses Elliptic Curve Cryptography (ECC), a topic that we previously introduced in the post on Elliptic Curve Groups. For a given public key length, ECC bestows on ECDSA a significant security advantage over its DSA counterpart. This advantage is a consequence of the observation that the security of cryptographic primitives built on the presumed hardness of the Elliptic Curve Discrete Logarithm Problem (ECDLP) surpasses that of those built on the presumed hardness of the Discrete Logarithm Problem (DLP) on multiplicative cyclic subgroups.
To put this comparative advantage into perspective, we point out that the difficulty of solving ECDLP with 160-bit long public keys is comparable to that of solving DLP on a multiplicative cyclic subgroup with 1024-bit long public keys [3]. In this context, the notion of difficulty refers to the expected amount of time needed to break the discrete logarithm problem.
Being an ECC primitive, ECDSA requires signers and verifiers to agree on the parameters of the elliptic curve to be used. For bitcoin, the curve is secp256k1 whose defining parameters were previously introduced in the Elliptic Curve Groups post. We relist them below for ease of reference:
Bitcoin’s public-key cryptography is hence conducted on the subgroup
We let denote the security parameter associated with ECDSA. In what follows we define this signature scheme as a set of three algorithms:
be the truncation function mapping strings of arbitrary length to strings of length at most and such that:
proceeds as follows:
( times)
The signing algorithm finally outputs a signature (r, s); it is modeled as a PPT Turing machine.
The verification algorithm is deterministic as opposed to probabilistic.
Correctness of ECDSA. The ECDSA scheme satisfies the correctness property. In other words, any signature generated by the signing algorithm will cause the verification algorithm to output True. To prove it, we follow a similar logic to the one used to prove DSA’s correctness. More specifically, let (r, s) be a signature on message m and note that:
The verification algorithm will then compute:
The previous equality allows us to conclude that:
hence validating and establishing ECDSA’s correctness.
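In the notation of the implementation that follows — private key d, public key H = dG, nonce k, P = kG, r = Pₓ mod n, and s = k⁻¹(z + dr) mod n with z the truncated message hash — the verification computation reads:

```latex
\begin{aligned}
u &\equiv z s^{-1} \pmod{n}, \qquad v \equiv r s^{-1} \pmod{n},\\[2pt]
W &= uG + vH = z s^{-1} G + r s^{-1}\, dG = (z + dr)\, s^{-1} G = kG = P,\\[2pt]
W_x \bmod n &= P_x \bmod n = r.
\end{aligned}
```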
Illustrative implementation in python. In what follows, we show how the ECDSA signature scheme could be implemented in python. Note that it is always recommended to rely on existing and well-tested implementations. The one below is for educational purposes and we built it from scratch with the sole intention of illustrating the process.
ECDSA relies on elliptic curve point addition and scalar multiplication. We include below five python methods, the first three of which feed into mul_scalar that performs elliptic-curve point multiplication. The last method verifies whether a point belongs to a pre-specified elliptic curve or not. The first two methods were sourced from [7].
def extended_euclidean_algorithm(a, b):
    """
    Returns a three-tuple (gcd, x, y) such that
    a * x + b * y == gcd, where gcd is the greatest
    common divisor of a and b.

    This function implements the extended Euclidean
    algorithm and runs in O(log b) in the worst case.
    """
    s, old_s = 0, 1
    t, old_t = 1, 0
    r, old_r = b, a

    while r != 0:
        quotient = old_r // r
        old_r, r = r, old_r - quotient * r
        old_s, s = s, old_s - quotient * s
        old_t, t = t, old_t - quotient * t

    return old_r, old_s, old_t
def inverse_of(n, p):
    """
    Returns the multiplicative inverse of n modulo p.

    This function returns an integer m such that
    (n * m) % p == 1.
    """
    gcd, x, y = extended_euclidean_algorithm(n, p)
    assert (n * x + p * y) % p == gcd

    if gcd != 1:
        # Either n is 0, or p is not a prime number.
        raise ValueError(
            '{} has no multiplicative inverse '
            'modulo {}'.format(n, p))
    else:
        return x % p
def add_points(A, B, p, p1, p2):
    if (p1 == "O"):  # "O" denotes the identity element of the group
        return p2
    elif (p2 == "O"):
        return p1
    else:
        x_p1, y_p1 = p1
        x_p2, y_p2 = p2
        if (not ((x_p1 - x_p2) % p) and ((y_p1 - y_p2) % p)):
            # p1 and p2 are mutual inverses (same x, opposite y)
            return "O"
        elif (not ((x_p1 - x_p2) % p) and (not ((y_p1 - y_p2) % p))
              and (not (y_p1 % p))):
            # Doubling a point with y == 0: the tangent is vertical
            return "O"
        elif (not ((x_p1 - x_p2) % p) and (not ((y_p1 - y_p2) % p))
              and (y_p1 % p)):
            # Point doubling
            c = ((3*x_p1**2 + A) * inverse_of(2*y_p1, p)) % p
            d = (y_p1 - c*x_p1) % p
            x_p12 = (c**2 - 2*x_p1) % p
            return (x_p12, (-c*x_p12 - d) % p)
        elif ((x_p1 - x_p2) % p):
            # Addition of two distinct points
            c = ((y_p2 - y_p1) * inverse_of(x_p2 - x_p1, p)) % p
            d = ((x_p2 * y_p1 - x_p1 * y_p2)
                 * inverse_of(x_p2 - x_p1, p)) % p
            x_p12 = (c**2 - x_p1 - x_p2) % p
            return (x_p12, (-c*x_p12 - d) % p)
The mul_scalar method below implements the double-and-add algorithm previously introduced in the Elliptic Curve Groups post, and it relies on the add_points method.
def mul_scalar(A, B, p, p1, m):
    output = "O"
    while m > 0:
        if (m & 1):
            output = add_points(A, B, p, p1, output)
        m >>= 1
        p1 = add_points(A, B, p, p1, p1)
    return output
def is_on_ec(A, B, p, p1):
    if (p1 == "O"):
        return True
    x_p1, y_p1 = p1
    return ((y_p1 ** 2) % p == (x_p1**3 + A*x_p1 + B) % p)
We also saw that bitcoin’s ECDSA uses the secp256k1 elliptic curve. The following python variables specify the parameters of this curve:
For secp256k1, A_dec = 0 and B_dec = 7.
p_dec = 2**256 - 2**32 - 2**9 - 2**8 - 2**7 - 2**6 - 2**4 - 1
G_dec = (55066263022277343669578718895168534326250603453777594175500187360389116729240,
         32670510020758816978083085130507043184471273380659243275938904335757337482424)
n_dec = 115792089237316195423570985008687907852837564279074904382605163141518161494337
A_dec = 0
B_dec = 7
The ECDSA algorithm requires us to specify an appropriate hashing function. In the case of bitcoin, we use SHA256. In addition, the algorithm makes use of a truncated version of the hash of the message. These two operations are implemented as follows:
def truncate(num, Ln):
    # Convert to binary format, remove the leading '0b' characters and
    # extract the leftmost Ln bits
    num_bin = bin(num)[2: Ln+2]
    return int(num_bin, 2)
import hashlib

def message_Hash(m):
    # Transform m into byte format
    m_byte = m if isinstance(m, bytes) else bytes(m, 'utf-8')
    # Compute digest in hexadecimal format
    digest_hex = hashlib.sha256(m_byte).hexdigest()
    # Compute digest in decimal format
    digest_int = int('0x' + digest_hex, 16)
    return digest_int
Finally, we implement the three algorithms of the ECDSA scheme:
import random

def ecdsa_Key_Generate():
    # Generate decimal version of private key d which
    # is a scalar in the field (F_n)*
    d_flag = False
    while (d_flag == False):
        # Decimal value of random 256-bit scalar
        d = random.getrandbits(256)
        # Test if scalar is in the field (F_n)*
        d_flag = 0 < d < n_dec
    # Generate the decimal version of the public key H
    # associated with the private key d
    H = mul_scalar(A_dec, B_dec, p_dec, G_dec, d)
    return (d, H)
def ecdsa_Sign(d, m):
    # The call to the "truncate" method is not really needed
    # since in this case, message_Hash corresponds to SHA256
    # which is already 256-bit long
    z = truncate(message_Hash(m), 256)
    r, s = 0, 0
    while ((r == 0) or (s == 0)):
        k_flag = False
        while (k_flag == False):
            # Decimal value of random 256-bit scalar
            k = random.getrandbits(256)
            # Test if scalar is in the field (F_n)*
            k_flag = 0 < k < n_dec
        P = mul_scalar(A_dec, B_dec, p_dec, G_dec, k)
        r = P[0] % n_dec
        k_inv = inverse_of(k, n_dec)
        s = (k_inv * ((z + (d*r)) % n_dec)) % n_dec
    return (r, s)
def ecdsa_Verify(r, s, H, m):
    # Check if the point H is actually on the curve
    if (not(is_on_ec(A_dec, B_dec, p_dec, H))):
        return False
    # Check if r and s are both elements of (F_n)*
    if ((r < 1) or (r > n_dec - 1) or (s < 1) or (s > n_dec - 1)):
        return False
    z = truncate(message_Hash(m), 256)
    s_inv = inverse_of(s, n_dec)
    u = (z * s_inv) % n_dec
    v = (r * s_inv) % n_dec
    w_1 = mul_scalar(A_dec, B_dec, p_dec, G_dec, u)
    w_2 = mul_scalar(A_dec, B_dec, p_dec, H, v)
    W = add_points(A_dec, B_dec, p_dec, w_1, w_2)
    return (r == (W[0] % n_dec))
(d, H) = ecdsa_Key_Generate()
print("\n--------------------------- ECDSA KEY PAIR GENERATION --------------------------")
print("The generated private key is \n--- d = ", d)
print("The generated public key is H = (Hx, Hy) where: ")
print("--- Hx = ", H[0])
print("--- Hy = ", H[1])

# Sign message
print("\n--------------------------- ECDSA MESSAGE SIGNATURE ---------------------------")
m = "This is a test message"
print("The signed message is m = '", m, "'")
(r, s) = ecdsa_Sign(d, m)
print("The resulting signature tuple (r,s) is given by:")
print("--- r = ", r)
print("--- s = ", s)

# Verify signature on message
print("\n--------------------------- ECDSA SIGNATURE VERIFICATION ---------------------------")
ver = ecdsa_Verify(r, s, H, m)
print("(r,s) is a ", ver, "signature on m using public key H")
r_modified = r - 1
print("r_modified is: ", r_modified)
ver = ecdsa_Verify(r_modified, s, H, m)
print("(r_modified,s) is a ", ver, "signature on m using public key H")
ECDSA encoding. In bitcoin, an ECDSA signature is not encoded as a simple concatenation of r and s. Instead, it follows the Distinguished Encoding Rules, or DER for short. Those rules are formalized in the Abstract Syntax Notation One standard (ASN.1 for short), commonly used to encode arbitrary data objects into a structured binary file [14]. They allow for data compatibility between systems that may use different representations. However, the merit of using it in bitcoin remains unclear.
When (r, s) is encoded in DER format, we obtain a sequential structure of the form:
where:
Note that the above structure allows us to automatically deduce that:
To see an example of how this encoding is conducted in practice, consider an (r, s) signature given by its decimal representation:
This translates to a big-endian hexadecimal representation given by:
0x
0x
0x21 (i.e., 33 in decimal)
0x
0x20 (i.e., 32 in decimal)
0x
As a result, the DER-encoding of becomes:
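The layout above can be reproduced with a short sketch. The function name der_encode is ours, and the r = 1, s = 2 sanity check uses hypothetical values; real wallets rely on well-tested ASN.1 libraries rather than hand-rolled encoders.

```python
def der_encode(r, s):
    """DER-encode an ECDSA signature: a SEQUENCE (0x30) wrapping two
    INTEGERs (0x02), each preceded by its byte-length, with a 0x00
    byte prepended whenever the integer's high bit is set (so the
    value is not misread as negative)."""
    def encode_int(n):
        b = n.to_bytes((n.bit_length() + 7) // 8 or 1, 'big')
        if b[0] & 0x80:
            b = b'\x00' + b
        return bytes([0x02, len(b)]) + b

    body = encode_int(r) + encode_int(s)
    return bytes([0x30, len(body)]) + body

# Tiny sanity check with hypothetical values r = 1, s = 2:
assert der_encode(1, 2).hex() == '3006020101020102'
```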
Security of ECDSA: The importance of the randomness of the parameter k. A necessary condition for the ECDSA scheme to be secure is for the parameter k to be used only once per signature instance. The same logic applied earlier to DSA demonstrates that if this were not the case, one could easily retrieve the private key of the signer. DSA’s derivation can be replicated by replacing the subgroup order q with the curve group order n.
An example that underscores the importance of a random k is the hacking incident that affected Sony in December 2010. At the time, a group known as “fail0verflow” successfully retrieved the ECDSA private key used to sign software for the PlayStation 3. The hackers were able to do so because Sony misimplemented ECDSA by forcing a static k instead of choosing a random one for every signature.
Security of ECDSA: A note on existential unforgeability. There is no known proof of ECDSA’s security in the RO model. This may be surprising, given ECDSA’s usage in bitcoin. Here too, similarly to DSA, the belief that a security proof may be difficult to construct rests on our inability to date to successfully leverage the reduction model.
One can use the same reasoning outlined earlier for DSA to argue why a security proof for ECDSA based on the reduction model may be difficult to devise. The aforementioned logic can be applied in exactly the same way, save for a few nuances surrounding condition C that we describe next. One way of solving for an ECDSA private key is by constructing two distinct signatures and that lead to a linear equation in the unknown Conditions C’ and C below are jointly sufficient for this to be possible:
C’
C
Writing , condition C’ becomes:
Invoking C along with the fact that order we can compute:
.
In what follows, we derive necessary conditions for C’ and C to hold. Since both signatures are assumed to be valid, the verification equations guarantee that:
, for
As a result:
C’
Consequently, and must exhibit a certain relationship for the first condition to hold. With overwhelming probability, two randomly chosen parameters and will not satisfy this equality.
Since C’ and since for we can write:
C’ C
Recalling that for we conclude that:
Similarly to DSA, the takeaway is that to be able to effectively use the reduction technique to solve for in the case of ECDSA, one will possibly need to ensure that valid signatures and satisfy the following at a minimum:
Here too, we purposely used the term “possibly”. The remaining part of the argumentation is exactly the same as the one we previously outlined for DSA. We highlight again that our objective was not to argue that a security proof for ECDSA is not possible, but rather that such a proof may be difficult to achieve using the reduction technique.
Despite the absence to date of a security proof for ECDSA in the RO model, slight variants were shown to be secure:
Aside from these variants, Brown [6] and Fersch et al. [8] devised two different security proofs for the unmodified version of ECDSA by introducing extra modeling constraints.
The conversion function is none other than the one that ECDSA’s algorithm uses to calculate r with an elliptic curve point as input. The constraint that Fersch et al. impose consists of representing it as a composition of three functions, the middle of which is a bijection, with the other two modeled as random oracles. We will not go over the details of their proof, but the interested reader can refer to [8].
Security of ECDSA: Signature’s malleability. Once a signature has been issued on a given message, it is reasonable to require that no adversary be able to devise another valid signature on the same message. A signature is said to be malleable if it does not satisfy the aforementioned requirement. As a result, signature malleability could potentially lead to an instance of forgery, albeit in a restrictive sense since the message is taken to be the same. On the other hand, signatures that are simultaneously non-malleable and existentially unforgeable (i.e., resilient against EFACM) are referred to as strongly unforgeable [4].
Signature malleability leads to transaction malleability, a notion that we will discuss in a separate post dedicated to bitcoin transactions. For the purpose of our current discussion, it suffices to highlight that a bitcoin transaction is a data structure that encompasses four main categories of information:
Transactions are represented in a serialized byte format that we discuss in more detail in the bitcoin transactions post. The raw serialization is subjected to a double SHA-256 operation that outputs a hexadecimal digest known as the transaction id or txid for short. Any alteration to the body of the transaction, no matter how small, results in a different txid. This is a direct consequence of the expected behavior of a hashing function.
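The double SHA-256 operation can be sketched as follows. The sample bytes are placeholders, not a real serialized transaction; the point is only that any one-byte change produces an unrelated txid.

```python
import hashlib

def txid(raw_tx):
    """Transaction id: SHA-256 applied twice to the serialized
    transaction, conventionally displayed in reversed byte order."""
    digest = hashlib.sha256(hashlib.sha256(raw_tx).digest()).digest()
    return digest[::-1].hex()

# Any alteration, however small, yields a completely different txid.
tx = bytes.fromhex('0100')   # placeholder bytes, not a real transaction
assert txid(tx) != txid(bytes.fromhex('0101'))
assert len(txid(tx)) == 64   # 32 bytes rendered as 64 hex characters
```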
The critical observation is that although a signature is part of the body of the transaction, it is logically infeasible to sign a data structure inclusive of the resulting signature itself. Instead, the signing process is applied to the content of the transaction exclusive of the signature. More specifically, the message that gets signed includes information about the funding UTXOs, the destination addresses and their respective intended amounts.
By definition, a malleable signature scheme could lead to the creation of two valid but different signatures applied to the same transaction. Such an event would cause the bitcoin network to end up with at least two different txids referencing the same content. Such a situation could motivate a specific type of attack known as a malleability attack. The gist of it is as follows:
The above malleability attack can be interpreted as an instance of double-spending, although the malicious party in this case is the receiver and not the sender.
It turns out that ECDSA is malleable. In what follows, we describe three possible avenues to change it without modifying relevant content in the transaction. We highlight that the first two avenues could be exploited by any party including e.g., the recipient of a given transaction. As a result, they are conducive to malleability attacks. On the other hand, the third avenue is specific to the holder of the private key. If the sender is the only holder of the key, one can reasonably assume that no malleability attacks would ensue. We point out that bitcoin has already implemented measures to prevent the first two avenues from being nefariously exploited:
However, bitcoin’s original implementation did not strictly enforce this rule. As a result, one could derive an infinite number of encodings for a given pair. This source of signature malleability has been addressed in Bitcoin Improvement Protocol 66 (BIP 66) [15].
On the other hand, running the verification algorithm on will:
Let We get the following implications:
As a result, we rewrite ‘s verification steps as follows:
This type of signature malleability was supposed to be addressed in BIP62, but had to wait until Pull Request #6769 [1] to be resolved. The mitigation mechanism consisted of requiring that only the signature with the lowest value be valid.
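A minimal sketch of this low-s rule, using secp256k1’s group order n as listed earlier (the function name normalize_s is ours): since (r, s) and (r, n − s) verify the same message, only the representative with s ≤ n // 2 is kept.

```python
# secp256k1 group order (the value assigned to n_dec earlier)
n = 115792089237316195423570985008687907852837564279074904382605163141518161494337

def normalize_s(s):
    """Map s to its canonical 'low' representative: because (r, s) and
    (r, n - s) are both valid signatures on the same message, only
    s <= n // 2 is accepted on the network."""
    return n - s if s > n // 2 else s

assert normalize_s(1) == 1
assert normalize_s(n - 1) == 1   # the 'high' twin collapses to the same value
```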
ECDSA multisignatures. So far, our discussion of ECDSA signatures was limited to single signers. It turns out that more elaborate signatures could be constructed. In particular, we could have multiple private keys jointly sign a transaction in what is commonly known as a multisignature. An important observation is that the implementation of multisignatures in bitcoin consists of creating a separate signature for each private key and then grouping these signatures together. This construct results in at least three disadvantages:
In conclusion, we note that despite the ECC-inherited security features of ECDSA, the signature scheme is not fully immune to drawbacks including:
In light of these shortcomings, a new BIP advocating the adoption of a different signature scheme has been put forth. It turns out that similar to the Schnorr scheme, a variant of it known as Elliptic Curve Schnorr [16] is provably secure and non-malleable in the RO model. Moreover, this variant benefits from the linearity property that allows multiple private key holders to jointly sign a transaction such that the resulting signature is not a naive concatenation of individual signatures, but rather a non-trivial aggregation that reduces to a monosignature. We will discuss multisignatures and explain the advantages of the Elliptic Curve Schnorr variant in a separate post.
[1] Pull request 6769 – Script Verify Low s, 2015.
[2] Anonymous. Rebuttal to Schnorr’s patent claims re DSA, August 1998.
[3] Elaine Barker. Recommendation for key management part 1. NIST Special Publication 800-57 Part 1 Revision 4, January 2016.
[4] D. Boneh, E. Shen, and B. Waters. Strongly unforgeable signatures based on computational Diffie-Hellman. PKC LNCS, 3958:229-240, 2006.
[5] Ernest Brickell, David Pointcheval, Serge Vaudenay, and Moti Yung. Design validations for discrete logarithm based signature schemes. Public Key Cryptography. PKC 2000. Lecture Notes in Computer Science, 1751, 2000.
[6] Daniel R.L. Brown. The exact security of ECDSA. Technical Report CORR 2000-34 Certicom Research, 2000.
[7] Andrea Corbellini. Elliptic curve cryptography, a gentle introduction, 2015.
[8] Manuel Fersch, Eike Kiltz, and Bertram Poettering. On the provable security of (EC)DSA signatures. Proceedings of the 2016 ACM SIGSAC Conference on Computer and Communications Security, pages 1651-1662, October 2016.
[9] David W. Kravitz. Digital Signature Algorithm patent, 1991.
[10] John Malone-Lee and Nigel P. Smart. Modifications of ECDSA. Selected Areas in Cryptography|SAC, 2595:1-12, 2003.
[11] David Pointcheval and Serge Vaudenay. On provable security for digital signature algorithms. November 1996.
[12] Claus P. Schnorr. Method for identifying subscribers and for generating and verifying electronic signatures in a data exchange system, 1989.
[13] J. Stern, D. Pointcheval, J. Malone-Lee, and N.P. Smart. Flaws in applying proof methodologies to signature schemes. CRYPTO, pages 93-110, 2002.
[14] International Telecommunication Union. Itu-t x.690, July 2002.
[15] Pieter Wuille. BIP66 – Strict DER signatures, 2015.
[16] Pieter Wuille. Proposed BIP for 64-byte Elliptic Curve Schnorr signatures, July 2018.
The post Blockchain – Fork analysis appeared first on Delfr.
In this post we analyze two sources of divergence on the blockchain caused respectively by a natural fork and a malicious fork. The revolution that has been brought about by Bitcoin’s blockchain is a direct result of its open nature. Indeed, anyone can be part of it, suggest changes to it, mine new blocks in it, or simply conduct routine validations on it. It is in many respects, the epitome of decentralization and censorship-resistance. Its appealing nature is in large part rooted in its rich interdisciplinary foundation that spans across philosophy, mathematics and economics.
But beyond the elegance of its theoretical underpinning, the blockchain’s seamless implementation rests on an inherent agreement between its different participants. Without agreement, this harmonious apparatus would likely decay into chaos. The rather flawless operation of the system is the result of a particular consensus protocol known as Proof of Work (or PoW for short).
The consensus is meant to be amongst all of the miners on the network. It stipulates that any miner always extend the chain of blocks with the highest amount of cumulative work. In this context, work is a measure of the expected computational effort that a miner exerts in order to solve a given cryptographic challenge. In essence, the challenge consists in finding a nonce value that makes a hash computation emit an output with a mandatory minimum number of leading 0’s. The work associated with mining a given block corresponds to the expected number of hash evaluations needed to meet the target, which is dynamically adjusted to ensure that the network’s average block rate remains constant at 0.00167 blocks / second (i.e., 1 block per 10 minutes). We discuss PoW as well as other consensus protocols in more detail in another post.
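A toy version of this cryptographic challenge can be sketched as follows. We use a single SHA-256 instead of bitcoin’s double SHA-256, an arbitrary header string, and a tiny difficulty so the search terminates quickly; the expected number of attempts, 2**difficulty_bits, is the “work” of the block.

```python
import hashlib

def mine(header, difficulty_bits):
    """Search for a nonce such that SHA-256(header || nonce) has at
    least `difficulty_bits` leading zero bits, i.e. falls below the
    target 2**(256 - difficulty_bits)."""
    target = 1 << (256 - difficulty_bits)
    nonce = 0
    while True:
        digest = hashlib.sha256(header + nonce.to_bytes(8, 'big')).digest()
        if int.from_bytes(digest, 'big') < target:
            return nonce, digest.hex()
        nonce += 1

nonce, digest = mine(b'toy block header', 12)   # ~2**12 attempts on average
assert int(digest, 16) < 1 << (256 - 12)
```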
In an ideal setting where all miners are honest (i.e., abide by the PoW consensus protocol) and where blocks are propagated instantaneously on the network, all the nodes will always have a unified view of the blockchain — barring the extremely unlikely scenario of two distinct miners generating two valid blocks at the exact same time. However, imperfections do exist:
In the first case, miners are bound to momentarily experience diverging views of the blockchain. Even if all miners were honest, a network propagation delay would still cause natural forks to form on the blockchain. This is not desirable because one of the most important tenets of a well-functioning ledger consists in ensuring a unified view of the state of the system at any point in time.
We will show in section 2 that the probability of a natural fork occurring at some point in time on a system incurring an information propagation delay is equal to 1. However, we prove that the probability of sustaining a natural fork over a certain time interval is upper-bounded by a quantity that decays exponentially with the length of the interval. Consequently, any natural fork will collapse within finite time with high likelihood. A blockchain subject to the PoW consensus protocol is hence inclined to rapidly settle any natural fork that emerges.
In the second case, dishonest miners could stage an attack and maliciously attempt to redirect the blockchain to another chain of their liking and that suits their interest. The likelihood of success of such an attack (also known as a 51% attack) depends on the hashing power of the dishonest miner or pool of miners. We will discuss this case in section 3.
We start by defining the following network parameters:
generates blocks within seconds
We also define the following four events:
Note that in what follows, we make the following two assumptions:
Our objective is four-fold:
Objective #1: Natural forks on the blockchain happen with probability 1:
where “No fork is formed on the blockchain at any time in the interval “
To see why, note that if both and were not true, then there would exist at least two distinct miners that generate at least 1 block each in the interval But since it must be that at least two parallel blocks (one from each miner) coexist, hence forming a fork. Indeed, the choice of guarantees that none of the two blocks will have sufficient time to propagate to the other node. We conclude that is a necessary condition for to hold. Observing that and are disjoint, we get:
This is also independent of and so
And since we conclude that
Objective #2: A non-trivial upper bound on the probability of a natural fork occurring at an arbitrary block:
Suppose that all miners share a unified view of the blockchain. At time one of the miners adds block # to the blockchain. Miners that still haven’t received block # (which for all practical matters could take up to ) could generate their own block and start a fork at # We are interested in computing the probability that such an event occurs, i.e. We define the following:
By letting we can write:
And so letting we can write:
Letting be the ratio of miners that still haven’t received block # by time (this function is equal to in the notation of [1]), we get:
Note that if all miners share the same block rate , then As a result, we would get For small the upper bound can be approximated by as found in [1]. Below is a representation of a block propagation delay as observed on the of August 2018 [2]. In order to compute we make the assumption that the share of miners that still haven’t received the block by a certain time is the same as the corresponding share of generic nodes.
We find that Using blocks/sec, we get
Objective #3: Natural forks collapse with high likelihood in a finite amount of time:
In what follows, we derive a non-trivial upper bound on To do so, we will first find a lower bound on the probability of occurrence of its complement “A fork that was created at time collapses at some point in time in the interval “. An adequate lower bound on would be given by where is an event whose occurrence is a sufficient condition for to hold true. Our objective becomes one of finding an appropriate and calculating a lower bound on
In the tree below, we depict the different scenarios that lead to a natural fork formation, starting with a shared view of the blockchain. and are time instances corresponding to block formations by miners and respectively. Recall that denotes the average block propagation time between and Scenarios that lead to a fork formation are one of two types: those that lead to parallel chains of equal length (i.e., Type 1) and those that lead to chains of different lengths (i.e., Type 2):
Forks of type 1: Before time all miners share the same view of the blockchain. At (for some generates block # At (for some generates the following block # such that If we wait for seconds and no miner generates any block in interval then at time there will be more than one view of the blockchain, all of which have the same length. If for the following seconds (), one and only one generates at least one block #, and then for the following seconds no miner generates any block, then by time all forks would collapse. This is depicted in the figure below:
Forks of type 2: Before time all miners share the same view of the blockchain. At (for some generates block # At mines the next block # At (for some generates the following block # Two views of the blockchain will coexist at with one being a longer chain than the other. If we wait for seconds, and no miner generates any block in interval then by all forks would collapse. This is depicted in the figure below:
One consequence is that given a fork that was formed at time , if we ensure that for the next seconds no miner generates any blocks, and that for the subsequent seconds one and only one miner generates at least 1 block, and finally for the subsequent seconds no miner generates any block, then the fork would collapse in the interval This construction ensures a sufficient condition for a fork to collapse within a specific time interval. In what follows, we formalize our approach:
does not create any block in time interval
creates at least one block in time interval does not create any block in time interval
does not create any block in
Recognizing that all the events appearing under the union symbol above are disjoint, and that all the intersections are taken over independent events, we get:
And in particular,
Where denotes the floor function.
The objective function of the previous optimization problem is not smooth due to the presence of the floor function. As a result, finding a closed form analytical solution might prove difficult. However, numerical methods could be employed to find the optimal upper bound. Observe that this upper bound decays exponentially with and eventually converges to 0 when . We use the value of that corresponds to the average block propagation delay to reach 99% of the network observed over a period of time extending from May 2018 to August 2018 [2].
Below, we include various graphs of this upper bound for different values of and for fixed and
These graphs show that for the pre-defined values of and the probability that a natural fork survives minutes (which on average corresponds to the addition of 2 new blocks on the blockchain) is upper-bounded by 0.25 (or almost 1 in 4 cases). When minutes (i.e., the duration required to add an average of six new blocks on the blockchain), the upper bound goes down to 0.015 (or almost 1 in 67 cases). And when minutes, it goes down to 0.00022 (or almost 1 in 4500 cases).
In the graphs above, we assumed a fixed block rate of 1 block per 10 minutes. This is the value used by Bitcoin. For what it is worth, one could further optimize the upper bound over positive values of For fixed and the tightest upper bound becomes:
In order to solve it, we first find the optimal value of that solves the following optimization problem:
Note that since the exponent appearing in the objective function does not depend on and since the base is a positive quantity, we can solve the following equivalent optimization problem whose objective function is now smooth:
Note that when or when the objective function tends to 1. The objective function turns out to be convex in and we can solve for by setting its first derivative with respect to equal to 0. Doing so, yields:
The tightest upper bound over all positive values of and then satisfies:
For each of the following graph, we let and specify a particular value for For each value of we then calculate the corresponding as outlined above, and plot the graph of
Objective #4: A non-trivial upper bound on the probability of a double-spend transaction coexisting with the legitimate transaction, seconds after the legitimate transaction is added to the blockchain:
The derivations above demonstrate that even if all miners were honest, forks are bound to happen, although they collapse with high likelihood after a finite time of their formation. The existence of natural forks could encourage dishonest customers to engage in double-spending behavior.
To see how, consider a scenario in which a customer uses BTC to purchase a physical product. Let the corresponding transaction be denoted Suppose that at the vendor sees his transaction included in a certain block for the first time on the blockchain. Suppose that the customer issues another transaction (before or after ) destined to himself and that uses the same UTXOs as There is a chance that both and be selected by two different miners and be included in two separate coexisting blocks. We define the following double-spending event:
“ and coexist on the blockchain, seconds after ‘s addition to the blockchain in block “
Note that if no fork is formed at or before block then will not constitute a double-spending risk since would have propagated to all nodes. As a result, a necessary condition for a double-spending risk to exist consists of the union of the following events:
We would like to calculate a non-trivial upper bound on We can write:
We have seen earlier that
And if all miners have the same block rate, then
and
Moreover,
Assuming that we get
Next, note that
at least one block gets generated in interval
As a result,
Assuming we get
In the limit when becomes infinitesimally small and tends to the quantity tends to We can then write
Similar to , we have
And if all miners have the same block rate, then
and
We can then conclude that
The graphs below depict the upper bound on the probability that a double spend transaction coexists with on the blockchain seconds after is added to block for various values of and
These graphs show that if before handing over the goods, the vendor waits for an additional minutes after he sees added to block on the blockchain (which at a block rate of would roughly correspond to two additional blocks on top of ), then for and all miners being honest and sharing the same hash rate, the probability of a double-spend attempt coexisting with at time is upper-bounded by 0.00149.
That means that there is at most a probability of 0.00149 that is still part of a parallel fork. By virtue of being sustained, it could still become part of the longest chain. If this happens, gets thrown back into the mempool and gets validated instead. On the other hand, waiting for an additional minutes (i.e., roughly 5 additional blocks on top of ), would bring down this probability to 0.000183.
One could also optimize the upper bound not only over but also over Below is a plot showing the optimal value for the case minutes. In this case, the tightest upper bound is 1.33 which is achieved for blocks / sec and sec.
A small note on the value of To the extent of our knowledge, the choice of used in Bitcoin is not the result of a pure mathematical optimization exercise. The larger the value of the higher the probability of a natural fork occurring. Natural forks are not desirable as they possibly pave the way to double-spending attempts. However, this is not the only metric that counts.
Another important consideration has to do with the storage capacity requirement and the rate of growth of such capacity that needs to be maintained at the level of each full node on the network. A higher means faster transaction processing but also faster growth of the ever-increasing storage requirement. It is most likely that only a handful of nodes will be able to afford such storage, subsequently leading to a centralization scenario. This stands in sharp contrast with Bitcoin’s fundamental philosophy. As a result of this tradeoff, Satoshi’s choice of is probably a good compromise.
In this section we look at the second type of imperfections alluded to earlier. More specifically, we turn to the possibility that a subset of dishonest miner(s) decide to disregard the PoW consensus protocol and mine on top of a parallel chain different than the one with the highest amount of cumulative work.
This type of behavior has been introduced and analyzed in section 11 of Nakamoto’s seminal paper [3]. The analysis demonstrates that dishonest miner(s) could possibly generate a parallel chain that overtakes the original honest chain. As a consequence, malicious miner(s) can potentially engage in double spending behavior.
Such a scenario is commonly referred to as a 51% attack, although malicious miner(s) do not necessarily need 51% of the total hashing power of the network to launch a double spending attack. A control of 51% or more of the total hashing power will however guarantee that the attack will be successful. On the other hand, control of less than 51% is not associated with a deterministic state of success, but rather a probabilistic one. The analysis in [3] quantifies the probability of success as a function of said control. In this section, we simply clarify the mathematical foundation of this analysis.
Building on the notation used in section 2, we further define the following quantities:
We also define the following two events:
We can write:
Calculating : Given a network block rate of blocks/sec, the rate associated with honest miners is blocks/sec, and that associated with malicious miner(s) is blocks/sec. As a result, sec. In this interval, the subset of malicious miner(s) generate an average of blocks. As a result, we can model malicious miner(s) block generation over this interval as a Poisson process with mean blocks. We get:
Calculating : Knowing that in interval the malicious miner(s) generated blocks on the parallel chain, we need to calculate the probability that the malicious miner(s) catch up to the honest chain and generate a parallel chain that is at least as long (note that technically speaking, the parallel chain should be one block longer than the honest chain for the attack to be successful, but Nakamoto’s analysis considers the case of equal length instead). Clearly, if the probability is 1. When we can model the process as a binomial random walk whereby given that the malicious miner(s)’s parallel chain is blocks shorter than the honest chain:
This problem turns out to be a slight variant of the Gambler’s Ruin Problem that we introduce next.
In what follows, we calculate the probability that the gambler wins knowing that she started the game with UOC This is the probability that she reaches a fortune of UOC at some point in time knowing that she started off with UOC We denote it by A similar derivation can be found in [5] and [4]. Note that in this version of the game, the gambler cannot play if she does not have a positive amount of capital to start the betting process with. This stands in contrast to the aforementioned situation where malicious miner(s) start off with a block deficit. Later on, we will account for this variation.
We start by defining the following events:
{ “The gambler wins”
{ “The gambler wins the first bet in the series”
{ “The gambler loses the first bet in the series”
And let be a random variable denoting the amount of capital held by the gambler at the beginning of the game.
We can write:
In other terms, we have:
since
Recognizing a telescoping series structure, we write:
And so,
if
if
Noting that and applying the above when we get
if
if
Which then allows us to conclude that
if
if
Putting it altogether, we get
if
if
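The piecewise result above is the standard Gambler's Ruin solution: with win probability p per bet (and q = 1 - p), the probability of reaching a fortune of N starting from i units is (1 - (q/p)^i) / (1 - (q/p)^N) when p ≠ q, and i/N when p = q. As a sanity check, here is a sketch comparing the closed form against a direct Monte Carlo simulation (the function names and the fixed seed are ours, for illustration only):

```python
import random

def ruin_win_prob(p, i, N):
    # Closed-form probability that the gambler, starting with i units,
    # reaches N before going broke, with win probability p per bet.
    q = 1.0 - p
    if p == q:
        return float(i) / N
    r = q / p
    return (1 - r**i) / (1 - r**N)

def simulate(p, i, N, trials, seed=42):
    # Monte Carlo estimate of the same probability.
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        capital = i
        while 0 < capital < N:
            capital += 1 if rng.random() < p else -1
        wins += (capital == N)
    return wins / float(trials)

print(ruin_win_prob(0.5, 3, 10))  # 0.3 for a fair game
```

For a favorable game (p = 0.6, i = 3, N = 10) the closed form gives roughly 0.716, and the simulation should agree to within sampling error.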
Note that under this setting, the block deficit may widen without having any lower-bound constraint. In the setting of the Gambler’s Ruin Problem, the gambler cannot play when she is in a deficit and will have to stop as soon as her betting capital reaches 0. As such, the problem must be modified to account for a scenario where the gambler could borrow unlimited credit if need be, to continue playing the game.
An equivalent formulation consists of a gambler having an infinite amount of capital to start with. Formally, we assume the same setting as the original problem. The gambler however, has a deficit of UOC . She then borrows UOC and starts the game with the objective of reaching UOC where . If she wins, she returns UOC to her creditor and keeps UOC so as to break-even. In case she loses, she does not return anything to her creditor.
Clearly, the above setting is unrealistic due to the dearth of extremely benevolent creditors. However, when we are dealing with a deficit of a block nature rather than a monetary one, this setting becomes acceptable. We then let and compute the corresponding probability of success. In our block deficit case, blocks.
We get
if
if
Next, note that if then And so
And if then And so
Finally, if then
As a result, if and if
If malicious miner(s) are not incurring a block deficit, i.e., then
Otherwise, if then noting that the probability of malicious miner(s) finding the next block is equal to and the one corresponding to the honest miners is equal to we get:
if and if
The calculations above allow us to conclude that:
if
and if
In the graphs below we plot the probability of a successful 51% attack as a function of the fraction of the total hashing power controlled by the subset of malicious miner(s). We do so for different values of the block validation height
For example, assuming that malicious miner(s) controlled 15% of the total hashing power of the network (i.e., ), then a block validation height of 6 blocks (i.e., ) is associated with a probability of a successful attack of 0.268%.
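The derivation above can be condensed into the success-probability formula from section 11 of [3]: letting q be the attacker's fraction of the hashing power, p = 1 - q, and λ = z·q/p, the probability of a successful attack is 1 − Σ_{k=0}^{z} (λ^k e^{−λ}/k!) · (1 − (q/p)^{z−k}). A short sketch reproducing the 15% / 6-block example quoted above:

```python
import math

def attack_success_prob(q, z):
    # Probability that an attacker controlling fraction q of the total
    # hashing power ever catches up from z blocks behind ([3], section 11).
    p = 1.0 - q
    if q >= p:
        return 1.0  # with a majority, catching up is certain
    lam = z * (q / p)
    s = 1.0
    for k in range(z + 1):
        poisson = math.exp(-lam) * lam**k / math.factorial(k)
        s -= poisson * (1 - (q / p)**(z - k))
    return s

print(round(attack_success_prob(0.15, 6), 5))  # ~0.00268
```

With q = 0.15 and z = 6 this evaluates to about 0.268%, matching the figure cited above.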
[1] Christian Decker and Roger Wattenhofer. Information propagation in the bitcoin network, 2013.
[2] DSN. Block propagation delay history.
[3] Satoshi Nakamoto. Bitcoin: A peer-to-peer electronic cash system. White Paper, 2008.
[4] A. Pinar Ozisik and Brian Neil Levine. An explanation of nakamoto’s analysis of double-spend attacks, online, 2017.
[5] Karl Sigman. Gambler’s ruin problem, online, 2016.
The post Bitcoin – Private key, Public key, and Addresses appeared first on Delfr.
The objective of this post is to introduce the reader to Bitcoin’s private and public keys, and to the Bitcoin addresses used in Pay to Public Key Hash transactions (P2PKH) and Pay to Script Hash transactions (P2SH).
As was previously introduced in the Elliptic Curve Groups post, the linkage between Bitcoin’s private and public keys is determined by a specific elliptic curve known as secp256k1. Recall that the curve’s parameters are as follows:
denotes the point at infinity and is the identity element of the group. Here is a Euclidean representation of this curve when (it is not feasible to show it for ).
which in hexadecimal notation are given by:
79BE667E F9DCBBAC 55A06295 CE870B07 029BFCDB 2DCE28D9 59F2815B 16F81798
483ADA77 26A3C465 5DA4FBFC 0E1108A8 FD17B448 A6855419 9C47D08F FB10D4B8
Bitcoin’s public-key cryptography is hence conducted on the subgroup
which in hexadecimal notation is given by
FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFE BAAEDCE6 AF48A03B BFD25E8C D0364141
We also saw that Bitcoin’s private and public keys obey the following architecture:
( times)
It is a 512-bit long string denoting the elliptic curve point It is an element of the set which in this case is equivalent to Both and are 256-bit long.
The most important observation was that one can efficiently calculate from using e.g., the double-and-add method, but that deriving from is thought to be intractable. We saw that this conclusion is a manifestation of the exponential hardness of the Elliptic Curve Discrete Logarithm Problem (ECDLP).
In what follows we include four python methods, the first three of which feed into the method entitled mul_scalar that performs elliptic-curve point multiplication. The first two methods were sourced from [3]:
extended_euclidean_algorithm(a, b): it takes two integers and and returns a three-tuple consisting of gcd and the bezout coefficients and that satisfy gcd (refer to Groups and Finite Fields):
def extended_euclidean_algorithm(a, b):
    """
    Returns a three-tuple (gcd, x, y) such that a * x + b * y == gcd,
    where gcd is the greatest common divisor of a and b.

    This function implements the extended Euclidean algorithm and runs
    in O(log b) in the worst case.
    """
    s, old_s = 0, 1
    t, old_t = 1, 0
    r, old_r = b, a
    while r != 0:
        quotient = old_r // r
        old_r, r = r, old_r - quotient * r
        old_s, s = s, old_s - quotient * s
        old_t, t = t, old_t - quotient * t
    return old_r, old_s, old_t
inverse_of(n,p): it computes the inverse of mod by relying on the extended_euclidean_algorithm method (refer to Groups and Finite Fields):
def inverse_of(n, p):
    """
    Returns the multiplicative inverse of n modulo p.

    This function returns an integer m such that (n * m) % p == 1.
    """
    gcd, x, y = extended_euclidean_algorithm(n, p)
    assert (n * x + p * y) % p == gcd
    if gcd != 1:
        # Either n is 0, or p is not a prime number.
        raise ValueError(
            '{} has no multiplicative inverse '
            'modulo {}'.format(n, p))
    else:
        return x % p
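As a cross-check, when p is prime the same inverse can be obtained from Fermat's little theorem (n^(p-2) ≡ n^(-1) mod p), or, in Python 3.8 and later, with the built-in three-argument pow and a negative exponent. The helper name below is ours, not part of the article's code:

```python
def inverse_of_fermat(n, p):
    # Multiplicative inverse of n modulo a prime p via Fermat's little
    # theorem: n^(p-2) is congruent to n^(-1) modulo p.
    return pow(n, p - 2, p)

assert inverse_of_fermat(3, 7) == 5  # 3 * 5 = 15, and 15 % 7 == 1
assert pow(3, -1, 7) == 5            # Python 3.8+ built-in form
```

Unlike the extended-Euclidean version, this shortcut silently returns a wrong answer when p is not prime, so it is only a convenience for the prime moduli used here.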
The rules for adding two points was outlined in the Elliptic Curve Groups post:
'''
This method adds 2 points p1 and p2 in the elliptic curve group associated
with the elliptic curve equation E: y^2 = x^3 + Ax + B mod(p)
'''
def add_points(A, B, p, p1, p2):
    if (p1 == "O"):
        return p2  # "O" denotes the identity element of the group
    elif (p2 == "O"):
        return p1
    else:
        x_p1, y_p1 = p1
        x_p2, y_p2 = p2
        if (not ((x_p1 - x_p2) % p) and ((y_p1 - y_p2) % p)):
            return "O"
        elif (not ((x_p1 - x_p2) % p) and (not ((y_p1 - y_p2) % p)) and (not (y_p1 % p))):
            return "O"
        elif (not ((x_p1 - x_p2) % p) and (not ((y_p1 - y_p2) % p)) and (y_p1 % p)):
            c = ((3*x_p1**2 + A) * inverse_of(2*y_p1, p)) % p
            d = (y_p1 - c*x_p1) % p
            x_p12 = (c**2 - 2*x_p1) % p
            return (x_p12, (-c*x_p12 - d) % p)
        elif ((x_p1 - x_p2) % p):
            c = ((y_p2 - y_p1) * inverse_of(x_p2 - x_p1, p)) % p
            d = ((x_p2 * y_p1 - x_p1 * y_p2) * inverse_of(x_p2 - x_p1, p)) % p
            x_p12 = (c**2 - x_p1 - x_p2) % p
            return (x_p12, (-c*x_p12 - d) % p)
It implements the double-and-add algorithm previously introduced in the Elliptic Curve Groups post and it relies on the add_points method
def mul_scalar(A, B, p, p1, m):
    output = "O"
    while m > 0:
        if (m & 1):
            output = add_points(A, B, p, p1, output)
        m >>= 1  # Shift the bit-representation of m by 1 bit to the right
        p1 = add_points(A, B, p, p1, p1)  # and double the point p1
    return output
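To see the double-and-add idea in action without 256-bit numbers, here is a self-contained sketch on the small curve y² = x³ + 2x + 2 over F₁₇, whose base point (5, 1) generates a group of order 19 (a classic textbook example). It uses None in place of the string "O" for the identity and a Fermat-based inverse, so the function names and conventions differ slightly from the article's code:

```python
def inv_mod(n, p):
    # Modular inverse via Fermat's little theorem (assumes p prime).
    return pow(n, p - 2, p)

def ec_add(A, p, P, Q):
    # Group law on y^2 = x^3 + A*x + B (mod p); None plays the role of "O".
    # (Doubling a point with y = 0 is not handled; it cannot occur below.)
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None  # P + (-P) = O
    if P == Q:
        c = (3 * x1 * x1 + A) * inv_mod(2 * y1 % p, p) % p
    else:
        c = (y2 - y1) * inv_mod((x2 - x1) % p, p) % p
    x3 = (c * c - x1 - x2) % p
    return (x3, (c * (x1 - x3) - y1) % p)

def ec_mul(A, p, P, m):
    # Double-and-add scalar multiplication, same scheme as mul_scalar above.
    out = None
    while m > 0:
        if m & 1:
            out = ec_add(A, p, out, P)
        m >>= 1
        P = ec_add(A, p, P, P)
    return out

# Base point (5, 1) on y^2 = x^3 + 2x + 2 over F_17 has order 19:
print(ec_mul(2, 17, (5, 1), 2))   # (6, 3)
print(ec_mul(2, 17, (5, 1), 19))  # None (back to the identity)
```

Multiplying the base point by its group order 19 returns the identity, exactly as 2P, 3P, … cycle through the 18 finite points of the group.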
Private keys – Decimal representation: We start by generating a random private key that is 256 bits long (recall that the private key can be any number between 1 and where is the prime constant denoting the order of the base point ). We include below an example of a python code that does this. But first, we specify the parameters of the elliptic curve group associated with the secp256k1 curve:
p_dec = long(2**256 - 2**32 - 2**9 - 2**8 - 2**7 - 2**6 - 2**4 - 1)
G_dec = (55066263022277343669578718895168534326250603453777594175500187360389116729240,
         32670510020758816978083085130507043184471273380659243275938904335757337482424)
n_dec = 115792089237316195423570985008687907852837564279074904382605163141518161494337
A_dec = 0
B_dec = 7
Next, we generate a random private key in decimal notation that we assign to variable priv_key_dec
'''
The private key can be any number between 1 and (n_dec - 1). In what follows,
we generate a 256-bit random integer and then test if it is in the specified range.
'''
priv_key_flag = False
while (priv_key_flag == False):
    priv_key_dec = random.getrandbits(256)  # Decimal value of random 256-bit scalar
    priv_key_flag = 0 < priv_key_dec < n_dec  # Test if scalar is in the field (F_n)*
print "\nThe private key in decimal representation (mod n) is: ", priv_key_dec
priv_key_dec
Private keys – Hexadecimal representation: The private key can be represented in numerous ways. All representations must however correspond to the same 256-bit number. Hexadecimal and raw binary formats are reserved for use by software and are not usually shown to end users. In python, the hex() method converts an integer to its hexadecimal representation and outputs a string of the form ‘0x….’ where the ‘0x’ prefix refers to hexadecimal format.
There is one caveat however. It is possible for the randomly generated private key not to be big enough to fill all of the 256 bits (recall that the private key can be any positive integer less than ). If this is the case, we would need to add enough leading 0’s to ensure that the final length is 256 bits. The following python method is one way of completing the hexadecimal representation whenever needed:
'''
Whenever needed, this method completes the hex representation with enough
leading 0's to ensure that the total number of hexadecimal digits (i.e.,
nibbles) is equal to 64. This corresponds to 32 bytes or equivalently 256 bits.
'''
def comp_256bit_hex(hex_str):  # hex_str must be a hex string of the form '0x.....'
    if (hex_str[-1] == "L"):  # Get the hex version without "L" (long-type specifier)
        hex_str = hex_str[2:-1]  # Get the hex version without the '0x' prefix
    else:
        hex_str = hex_str[2:]
    l = len(hex_str)
    if (l < 64):
        return (64-l)*'0' + hex_str  # Add leading 0's if less than 64 nibbles
    else:
        return hex_str  # Return a hex string without the leading '0x' prefix
priv_key_hex = comp_256bit_hex(hex(priv_key_dec))  # To ensure hex format is 256-bit long
print "The private key in hexadecimal format is: ", priv_key_hex
priv_key_hex
8962 D6F7 92E5 89C1 1C56 740B A30C B832 AF0A E891 A9DA 1D0C 71B4 EF9D 0043 BE2A
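Incidentally, on Python 3 the same fixed-width padding can be obtained directly from the format mini-language, without the manual '0x'/'L' stripping. This is a small alternative sketch (the function name is ours), equivalent to comp_256bit_hex(hex(i)) for keys in the valid range:

```python
def to_256bit_hex(i):
    # Format an integer as a 64-nibble (256-bit) lowercase hex string,
    # zero-padded on the left.
    return format(i, '064x')

print(to_256bit_hex(31))  # 62 zeros followed by '1f'
```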
Private keys – WIF representation: Another format for representing private keys is the Wallet Import Format or WIF for short. The WIF format is used whenever a private key is imported or exported from one wallet to another. The Quick Response code (QR) of a private key is usually displayed in WIF format. To perform WIF encoding, the following sequential procedure (also known as the base58Check encoding procedure) is implemented:
The steps are self-explanatory, except possibly for the last one. A base 58 encoding is similar in concept to any other base transformation. The alphabet used in this case consists of the following 58 elements:
The rationale for base 58 encoding is explained in the original Bitcoin client source code:
In what follows, we show how to convert a positive integer to its base 58 representation. Let’s take the integer 19,099 as an example:
Here is an example of a python code that applies this procedure to any non-negative integer
'''
The base58 alphabet consists of 58 characters given as follows.
'''
alphabet = '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz'

'''
Method to encode integers into base58 format
'''
def base58_encode_int(i):  # Argument must be an integer
    output = ""
    while i:
        i, r = divmod(i, 58)
        output = alphabet[r:r+1] + output
    return output
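Applying the procedure to the 19,099 example above: 19099 = 5·58² + 39·58 + 17, and the alphabet characters at indices 5, 39 and 17 are '6', 'g' and 'J', so the encoding is '6gJ'. A self-contained check (re-implementing the encoder so the snippet stands alone):

```python
ALPHABET = '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz'

def b58_encode_int(i):
    # Repeatedly divide by 58 and map each remainder to the alphabet.
    output = ''
    while i:
        i, r = divmod(i, 58)
        output = ALPHABET[r] + output
    return output

print(b58_encode_int(19099))  # '6gJ'
```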
In what follows, we show how the base58Check encoding can be implemented in python. Note that it is always recommended to rely on existing implementations such as the one used by the Bitcoin client or as part of other libraries developed specifically for python. The one we include below is for educational purposes and we built it from scratch with the sole intention of illustrating the process:
'''
The following method can be used to either:
1. Apply the WIF-encoding scheme to private keys (when the 'key_hex' argument
   is a private key in hex format, and the 'ver_prefix' argument is equal to 0x80).
2. Derive the Pay To Public Key Hash (P2PKH) bitcoin address associated with
   a given public key in compressed or uncompressed format (when the 'key_hex'
   argument is a public key in hex format, and the 'ver_prefix' argument is
   equal to 0x00).
'''
def base58Check(key_hex, ver_prefix):
    # key_hex is a hex string w/o the '0x' prefix. It can be a private key
    # (WIF encoding), a public key (P2PKH address) or a redeem script
    # (P2SH address)
    key_hex_extended = ver_prefix + key_hex  # Add the appropriate version prefix
    first_sha256 = sha256(binascii.unhexlify(key_hex_extended)).hexdigest()
    second_sha256 = sha256(binascii.unhexlify(first_sha256)).hexdigest()
    checksum = second_sha256[:8]  # First 8 nibbles of the double-sha256 output
    key_final = key_hex_extended + checksum  # The final key in hex format
    '''
    --- If the version prefix is '0x00', then the encoding is being applied to
        a public key in order to derive the corresponding P2PKH bitcoin address.
        In this case, the public key is converted from hex to decimal, encoded
        in base 58 and then prefixed with a '1'.
    --- If the version prefix is '0x05', then the encoding is being applied to
        a script in order to derive the corresponding P2SH bitcoin address. In
        this case, the script is converted from hex to decimal and encoded in
        base 58. We don't add a leading '1'.
    --- If the version prefix is '0x80', then the encoding is being applied to
        a private key in order to convert it to its WIF format. In this case,
        the private key is converted from hex to decimal and encoded in base 58.
        We don't add a leading '1'.
    '''
    if (ver_prefix == '00'):
        return '1' + base58_encode_int(int(key_final, 16))
    else:
        return base58_encode_int(int(key_final, 16))
We can now obtain the WIF-encoded private key as follows:
priv_key_wif = base58Check(priv_key_hex, '80')
print "The WIF-encoded private key is: ", priv_key_wif
priv_key_wif
5JrnvbTmxMqYNFVpcyuBq196xLsrTG7yXNCeRRi1DeDibFZFEoA
A private key encoded in WIF format will always start with a 5. To see why, note that the base58Check method creates a 37-byte long string (a byte-long version prefix, a 32-byte-long private key, and a 4-byte-long checksum) that it transforms into decimal notation before feeding into the base58_encode_int method. The version prefix is set to ’80’ in hexadecimal notation. The smallest and largest sequences of 74 nibbles (i.e., 37 bytes) that can be formed with a ’80’ prefix are respectively given by:
8000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 00
80FF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FFFF FF
When these hexadecimal strings get transformed to decimal representation and then fed to base58_encode_int, we respectively obtain:
5HpHagT65TZzG1PH3CSu63k8DbpvD8s5ip4nEB3kEsreAbmahZy
5Km2kuu7vtFDPpxywn4u3NLu8iSdrqhxWT8tUKjeEXs2fDqZ9iN
Due to the nature of the base 58 encoding scheme (which works like any other base), the image of any valid string of 74 nibbles will be confined to this range, and hence is bound to start with a 5.
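The range argument can be checked numerically: base58-encoding the smallest and largest 37-byte values that begin with the 0x80 byte yields 51-character strings, both starting with '5'. A minimal sketch (re-using a bare base58 integer encoder so the snippet is self-contained):

```python
ALPHABET = '123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz'

def b58_encode_int(i):
    output = ''
    while i:
        i, r = divmod(i, 58)
        output = ALPHABET[r] + output
    return output

lo = int('80' + '00' * 36, 16)  # 0x80 followed by 36 zero bytes
hi = int('80' + 'ff' * 36, 16)  # 0x80 followed by 36 0xff bytes
lo_enc, hi_enc = b58_encode_int(lo), b58_encode_int(hi)
print(lo_enc[0], hi_enc[0], len(lo_enc), len(hi_enc))
```

Both encodings are 51 characters long and start with '5', confirming that every WIF-encoded (uncompressed) private key falls in this range.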
Private keys – WIF-compressed representation: Recall that the public key is a point on the elliptic curve defined by an abscissa and an ordinate. It can be represented in one of two ways:
Consequently, given a private key, we need a mechanism to specify whether a compressed or uncompressed public key will be derived from it. The specification is done by adding the suffix ’01’ (in hexadecimal notation) to private keys from which compressed public keys are derived. We denote by WIF-compressed the format referring to the WIF-encoded private key augmented with the ’01’ suffix (i.e., corresponding to a compressed public key). We reserve the terminology WIF to refer to WIF-encoded private keys from which uncompressed public keys are derived.
priv_key_hex_aug = priv_key_hex + '01'
comp_priv_key_wif = base58Check(priv_key_hex_aug, '80')
print "The WIF-compressed private key is: ", comp_priv_key_wif
comp_priv_key_wif
L1pmhZ7BRLyFSnzDBp9LnscmHGjTGujnV2aQAp3yxoH8ZuD7ZBoA
An exercise similar to the one carried out for WIF-encoded keys reveals that all WIF-compressed formats start with either K or L.
Public keys – Uncompressed representation: By multiplying the private key k with the elliptic curve base point G, we generate the corresponding public key K = k·G, represented by the point (x, y) on the elliptic curve. To do so, we invoke the previously introduced mul_scalar method which outputs (x, y) in decimal format. We subsequently convert each coordinate to a 64-nibble-long (i.e., 256-bit-long) hexadecimal format. The uncompressed representation of the public key is then simply the concatenation of the hexadecimal prefix string ’04’, x, and y. We will see in the next section that we use a different prefix for the compressed public key representation.
pub_key_dec = mul_scalar(A_dec, B_dec, p_dec, G_dec, priv_key_dec)
print "\nThe uncompressed public key coordinates in decimal representation (mod p_dec) are: "
print "--- Abscissa: ", pub_key_dec[0]
print "--- Ordinate: ", pub_key_dec[1]
uncomp_pub_key_hex = '04' + comp_256bit_hex(hex(pub_key_dec[0])) + \
                     comp_256bit_hex(hex(pub_key_dec[1]))
print "\nThe uncompressed public key in hexadecimal representation is: "
print "---", uncomp_pub_key_hex
Abscissa (i.e., x):
82614379957574717635004007156871918815595721887726642301310338888550373918331
Ordinate (i.e., y):
109303218834573473139063892787029135592762135761853446650008917532324659076912
uncomp_pub_key_hex
04 B6A6 14FE BD17 5CCA 507B 3DD1 78C8 07E0 E18B 9D76 6A80 95E9 A90C EAB0 36B4 067B F1A7
6DF3 E94D 01F5 6CA3 B3CA 4B43 829C B87A C3A4 90BA C062 1A9D 12FB FC9E 1F30
Public keys – Compressed representation: The aforementioned shorter representation of public keys consists in storing the abscissa x of the public key alongside the parity of its ordinate y. If the parity is even, we concatenate the hexadecimal prefix string ’02’ with x. If it is odd, we use the hexadecimal prefix string ’03’ instead.
(pub_key_dec_x, pub_key_dec_y) = pub_key_dec
if (pub_key_dec_y % 2 == 0):
    comp_prefix = '02'
else:
    comp_prefix = '03'
comp_pub_key_hex = comp_prefix + comp_256bit_hex(hex(pub_key_dec[0]))
print "\nThe compressed public key in hexadecimal representation is:"
print "---", comp_pub_key_hex
comp_pub_key_hex
02 B6A6 14FE BD17 5CCA 507B 3DD1 78C8 07E0 E18B 9D76 6A80 95E9 A90C EAB0 36B4 067B
In essence, a Bitcoin address is a construct used to conveniently represent a destination of funds. It is important to highlight that an address is not a wallet and does not carry fund balances. As we will see in the post on Bitcoin transactions, whenever a particular address is used to spend some of its Bitcoins, all of its content gets debited: part of it goes to the recipient, part of it gets paid as a fee to the miner, and the remaining balance (if any) gets stored in a new address known as a change address.
Any person or entity can have as many Bitcoin addresses as they please. As a matter of fact, it is recommended to create a new address per new transaction (a practice that modern wallets implement by default).
More specifically, Bitcoin addresses are strings of alphanumeric characters that can start with either “1” or “3”. Note that there is also a newer address type known as Bech32 that starts with bc1 instead. It is a segwit address but is not widely adopted as of the time of this writing [2]. We will not cover it in this post and the reader interested in learning more about it can refer to e.g., [1]. Fundamentally, the two types of addresses (i.e., starting with “1” or with “3”) correspond to the following two cases:
The first type is known as a Pay to Public Key Hash address or P2PKH. These addresses always start with “1”. The name rationale stems from the fact that all that is needed to create the address is a hash of the public key as we will see shortly. To spend the funds, the recipient signs a new transaction using her private key. A two-step verification mechanism is then conducted: First, the system compares the address used as a source of funds with the one derived from the signer’s public key. In case of a match, a second step validates whether the signature corresponds to the sender’s public key or not. A match would indicate that the signer is the legitimate owner and can spend the funds without further constraints. We discuss P2PKH transactions in a later post.
The second type is known as a Pay to Script Hash address or P2SH. These addresses always start with “3”. They tend to be more complex than P2PKH in the sense that certain rules must be observed in order to unlock the funds. These rules require more than the provision of a single public key hash and of a signature derived from an appropriate private key. Applicable rules or conditions are captured in a construct known as a redeem script. The P2SH name rationale stems from the fact that all that is needed to create the address is a hash of the script. An example of a script would be an M-of-N multisignature, whereby it is required to have a minimum of M out of a total of N permissible signatures in order to unlock and spend the funds associated with that address. A single entity cannot spend them and hence a single private key is not enough. We discuss P2SH transactions in a later post.
P2PKH addresses: In order to derive the Bitcoin address associated with a given public key we make use of two one-way hash functions, namely sha256 and RIPEMD-160. Whereas sha256 outputs 256-bit long digests (i.e., 32 bytes), RIPEMD-160 outputs 160-bit long digests (i.e., 20 bytes). The procedure is as follows: first, apply sha256 to the public key; next, apply RIPEMD-160 to the resulting digest; finally, apply the Base58Check encoding to the resulting 20-byte hash.
Note that the last step is similar to that used to encode private keys in WIF format. There are two differences however: the version prefix is ’00’ instead of ’80’, and the payload is a 20-byte public key hash rather than a 32-byte private key.
Here is Python code that generates the P2PKH address associated with a compressed or an uncompressed public key:
first_sha256 = hashlib.sha256(binascii.unhexlify(uncomp_pub_key_hex)).hexdigest()
h = hashlib.new('ripemd160')  # RIPEMD160 is not in core hashlib. We instantiate it separately
h.update(binascii.unhexlify(first_sha256))  # Run RIPEMD160 on the binary rep of first_sha256
output = h.hexdigest()  # Convert the result into hex format
uncomp_btc_add = base58Check(output, '00')
print "\nThe P2PKH Bitcoin address associated with the uncompressed public key: ", uncomp_btc_add

first_sha256 = hashlib.sha256(binascii.unhexlify(comp_pub_key_hex)).hexdigest()
h = hashlib.new('ripemd160')
h.update(binascii.unhexlify(first_sha256))
output = h.hexdigest()
comp_btc_add = base58Check(output, '00')
print "The P2PKH Bitcoin address associated with the compressed public key: ", comp_btc_add
uncomp_btc_add
1nBXeFe4ZtXMUwhJou8TbPZrPCrwtPKEm
comp_btc_add
19rAiJBZDFoV36aq6xbjtgCHNvouXKdsTw
P2SH addresses: The procedure used to derive a P2SH address is similar to that employed to derive P2PKH addresses. The difference is two-fold: the quantity being hashed is the redeem script rather than a public key, and the version prefix fed to the Base58Check encoding is ’05’ instead of ’00’.
As an example, consider the following redeem script in hexadecimal:
52210232cfef1f9ec45bef08062640963aa8d6b15062c9c9e51c26682369969ba9101a
21029e1f52d753a7c68fb17adaa0b19f6b02f1266245186bc487c691743b6086ed5021
03d0fabbd163dd3a6ccf382b5e640622e9075f2676443499195d9b5f3e4c11993b53ae
For those interested, this script was retrieved on the blockchain (e.g., use blockchain.info) from the transaction with the following id:
38d8d5ad0fad303f7cebd9b7363f22d80f22576ba36846ae44e83ac32615472c
Here is Python code that generates the P2SH address associated with this script:
part_1 = "52210232cfef1f9ec45bef08062640963aa8d6b15062c9c9e51c26682369969ba9101a"
part_2 = "21029e1f52d753a7c68fb17adaa0b19f6b02f1266245186bc487c691743b6086ed5021"
part_3 = "03d0fabbd163dd3a6ccf382b5e640622e9075f2676443499195d9b5f3e4c11993b53ae"
red_script_hex = part_1 + part_2 + part_3
first_sha256 = hashlib.sha256(binascii.unhexlify(red_script_hex)).hexdigest()
h = hashlib.new('ripemd160')  # RIPEMD160 is not in core hashlib. We instantiate it separately
h.update(binascii.unhexlify(first_sha256))  # Run RIPEMD160 on the binary rep of first_sha256
output = h.hexdigest()  # Convert the result into hex format
P2SH_btc_add = base58Check(output, '05')
print "\nThe P2SH Bitcoin address associated with the redeem script is: ", P2SH_btc_add
P2SH_btc_add
38ttADsJCpMuzw8M6gEdLYxBrYFHp1FmWu
An exercise similar to the one carried out for WIF-encoded keys reveals that all P2SH addresses start with a 3.
Below is a chart summarizing the interrelation between private keys, public keys, and P2PKH and P2SH addresses.
[1] Bech32.
[2] Bech32 statistics.
[3] Andrea Corbellini. Elliptic curve cryptography, a gentle introduction.
The post Bitcoin – Private key, Public key, and Addresses appeared first on Delfr.
The post Monero – Content appeared first on Delfr.
I assume that the reader is familiar with basic probability theory, modulo arithmetic, as well as group theoretic concepts including the notions of cyclic groups and elliptic curve groups over finite fields. A concise introduction to group and field theory can be found in this post, and an introduction to elliptic curve groups in this one.
[1] J. Herranz and G. Saez. Forking lemmas in the ring signatures’ scenario. Proceedings of INDOCRYPT’03, Lecture Notes in Computer Science(2904):266-279, 2003.
[2] J. K. Liu, V. K. Wei, and D. S. Wong. Linkable spontaneous anonymous group signature for ad hoc groups. ACISP, Lecture Notes in Computer Science(3108):325-335, 2004.
[3] Greg Maxwell. Confidential transactions, 2015.
[4] S. Noether and A. Mackenzie. Ring confidential transactions. Monero Research Lab, 2016.
[5] D. Pointcheval and J. Stern. Security arguments for digital signatures and blind signatures. Journal of Cryptology, 2000.
[6] N. Van Saberhagen. CryptoNote v 2.0, 2013.
The post Elliptic Curve Groups – Crypto Theoretical Minimum appeared first on Delfr.
The sempiternal question of how to gain and maintain power has haunted the minds of humanity’s brightest and darkest since the dawn of civilization. Be it physical (e.g., military) or economical (e.g., wealth), power’s very existence relied in part on access to information. Asymmetric information, that is. Numerous are history’s examples that demonstrate how entities that knew what others didn’t, and that were able to act on it, benefited from an unfair advantage. The quest for sustainable power motivates the protection of one’s proprietary information and the attempt at breaching that of others.
Although significant in its own right, the pursuit of power is not the only motivator to conceal information. Privacy, in so far as the individual’s well-being is concerned, is another. In that respect, two areas stand out. The first is concerned with the unique nature of a human persona. As a matter of observation, and at the risk of irritating adherents of monism, the attributes of a human personality are so varied. Each attribute exists on a wide spectrum, making it unlikely that any two individuals have the same profile so to speak. The privacy spectrum is no exception, and while some live their lives as an open book, others might not even be comfortable sharing their half title page. The second area is concerned with the safety of a certain subset of individuals, e.g., whistle-blowers. They may hold sensitive information destined to be shared with a specific party. Should this information fall in the wrong hands, it could jeopardize the safety of the source.
It is therefore reasonable to assume that not every piece of information is meant to be common knowledge. One could certainly debate the merits of such a claim and in the process, revisit the very foundation of power, privacy and safety. The fact remains however, that information can be a source of influence, discomfort, and danger. One way of protecting specific content and limiting its access to intended parties only, is through the use of encryption and decryption algorithms.
Symmetric-key vs. public-key cryptography: Encryption can be thought of as a map that takes a relevant piece of data known as a message, and outputs an altered version of it. The map can be either one of two types: 1) invertible in polynomial time (PT-invertible), or 2) one-way.
As an example, consider a message space consisting of the case-agnostic latin alphabet of 26 letters. We represent each letter by its numerical equivalent (e.g., letter “a” or “A” represented by 1), and apply the following affine map:
E: (Z_26)* → (Z_26)*, (x_1, …, x_n) ↦ (a·x_1 + b, …, a·x_n + b) (mod 26)
where the superscript * denotes a string of arbitrary length, and a and b are pre-defined elements in Z_26 such that a is relatively prime to 26.
For instance, let a = 3 and b = 5. The word “chaos” has a representation given by (3, 8, 1, 15, 19). When fed to the map, one obtains an output given by (14, 3, 8, 24, 10). This corresponds to “nchxj”.
The inverse map can be written as follows:
D: (Z_26)* → (Z_26)*, (y_1, …, y_n) ↦ (a^(-1)·(y_1 − b), …, a^(-1)·(y_n − b)) (mod 26)
where a^(-1) is the inverse of a in modulo 26 arithmetic (i.e., a·a^(-1) ≡ 1 (mod 26)). Since a and 26 are relatively prime, one can use the extended euclidean algorithm outlined in the Groups and Finite Fields post to calculate a^(-1) in polynomial time. Consequently, E is a PT-invertible map. All that is required to build it and its inverse is the pair (a, b). This pair constitutes the symmetric key shared between the sender and the recipient.
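As an illustration, here is a short Python 3 sketch of this affine cipher. The key pair a = 3, b = 5 and the letter-to-number convention (a = 1) are assumptions chosen to reproduce the “chaos” → “nchxj” example:

```python
def affine_encrypt(text, a, b):
    # Map 'a'..'z' to 1..26, apply x -> a*x + b (mod 26), map back to letters.
    # Assumes text consists of lowercase letters only.
    nums = [ord(c) - ord('a') + 1 for c in text]
    enc = [(a * x + b) % 26 for x in nums]
    return ''.join(chr((y - 1) % 26 + ord('a')) for y in enc)

def affine_decrypt(cipher, a, b):
    # pow(a, -1, 26) computes a's modular inverse via the extended
    # Euclidean algorithm (Python 3.8+)
    a_inv = pow(a, -1, 26)
    nums = [ord(c) - ord('a') + 1 for c in cipher]
    dec = [(a_inv * (y - b)) % 26 for y in nums]
    return ''.join(chr((x - 1) % 26 + ord('a')) for x in dec)
```

With a = 3 and b = 5, affine_encrypt('chaos', 3, 5) yields 'nchxj', and decryption inverts it.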
Symmetric-key cryptography is known to be efficient, with the possibility of encrypting and decrypting large amounts of data relatively fast. Its weakness however, lies in the fact that the secret key must be shared between two (or more) parties over a secure channel. Enforcing perfect security and eliminating the risk of leakage over a digital communication channel is a challenging endeavour. Moreover, if such a secure medium of communication could be constructed, it would be legitimate to question the usefulness of sharing a secret key in the first place as opposed to using the secure channel to directly send and receive the actual messages.
Algorithms where the information needed to encrypt differs from that needed to decrypt form the basis of asymmetric cryptography. The nomenclature is a reflection of the informational asymmetry between encryption and decryption. More specifically, each recipient is associated with a key pair consisting of a unique private key only known to her, and a related public key that can be shared with anyone. Anyone can use the public key of a recipient to encrypt a message. Decryption however, requires knowledge of the private key which is only known to the recipient. In light of the above, a crucial criterion in the design of key pairs is that no entity should be able to derive the private key from the public one. The dual-key architecture is the reason why asymmetric cryptography is also known as public-key cryptography.
As an example, we consider the RSA encryption scheme. RSA generates the public and private keys of a user as follows:
Select two very large primes p and q such that p ≠ q.
Let n = p·q. One can observe that given p and q, it is easy to compute n. However, given n, it is extremely challenging to find p and q. This is known as the factoring problem, thought to be computationally intractable for large n.
Find n’s totient value φ(n). Euler’s totient function returns the number of positive integers less than or equal to n that are relatively prime to n. If p is prime, then φ(p) = p − 1. In addition, for any two coprime numbers m and n, φ(m·n) = φ(m)·φ(n). Consequently, φ(n) = φ(p)·φ(q) = (p − 1)·(q − 1). Finally, select an integer e relatively prime to φ(n) and compute its inverse d ≡ e^(-1) (mod φ(n)). The pair (n, e) constitutes the public key, while d constitutes the private key.
To encrypt a message destined to Bob, Alice first transforms it into an integer m by using a common-knowledge pre-defined mapping. We require that m and n be coprime. Subsequently, Alice computes the encrypted value c ≡ m^e (mod n), where (n, e) is Bob’s public key. In order to decrypt the message, Bob uses his private key d to compute c^d (mod n). To see why this works, note the following equalities:
c^d ≡ m^(e·d) ≡ m^(1 + k·φ(n)) ≡ m·(m^(φ(n)))^k (mod n), for some integer k.
We can invoke Euler’s theorem, which states that if m and n are relatively prime, then m^(φ(n)) ≡ 1 (mod n) (proof omitted). We conclude that c^d ≡ m (mod n), i.e., decryption recovers the original message.
Note that when the public key (n, e) is known, it is straightforward to compute c ≡ m^e (mod n). However, calculating its inverse (i.e., finding the value of m when c, n, and e are known) is thought to be hard. On the other hand, knowledge of the private key d allows a quick retrieval of m as we saw above.
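The key generation steps and the encryption/decryption round trip can be sketched in a few lines of Python 3. The small primes (61, 53) and the exponent 17 below are toy values for illustration only; real RSA uses primes of roughly a thousand bits or more:

```python
import math

def rsa_keygen(p, q, e):
    # Toy key generation: n = p*q, phi(n) = (p-1)(q-1), d = e^{-1} mod phi(n)
    n = p * q
    phi = (p - 1) * (q - 1)
    assert math.gcd(e, phi) == 1      # e must be relatively prime to phi(n)
    d = pow(e, -1, phi)               # modular inverse via extended Euclid (3.8+)
    return (n, e), d

def rsa_encrypt(m, pub):
    n, e = pub
    return pow(m, e, n)               # c = m^e mod n

def rsa_decrypt(c, d, n):
    return pow(c, d, n)               # m = c^d mod n
```

For instance, with p = 61, q = 53, e = 17, a message m = 65 encrypts and decrypts back to 65.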
A downside of public key cryptography is that it is not nearly as efficient as its symmetric counterpart, especially as the message size increases. However, symmetric-key cryptography depends on the existence of a secure channel which is challenging to build. The upside of asymmetric cryptography is that it bypasses the need for a secure channel altogether. It turns out that one can leverage the advantage of each type of cryptography to create a hybrid system that is both secure and efficient. This is accomplished through the use of a key-exchange protocol known as a Diffie-Hellman exchange.
The idea is to simply apply public-key cryptography to communicate a shared secret, which can then be used in a symmetric-key setting to encrypt and decrypt larger messages. Since the secret-key is relatively small (from a data standpoint), it can be encrypted rather efficiently using a public-key setting and then shared with a recipient on an untrusted channel. Larger message blocks can subsequently be effectively encoded and decoded using the secret key. More formally, an example of this setting can be described as follows:
Alice and Bob publicly agree on a cyclic group generated by an element g (for instance, the multiplicative group of a finite field of large prime order). Alice chooses a secret integer a and sends Bob the value g^a = g · g · … · g (a times). Bob chooses a secret integer b and sends Alice the value g^b = g · g · … · g (b times). Each party can now compute the shared secret g^(a·b) = (g^a)^b = (g^b)^a, which then serves as the symmetric key.
As noted earlier, the most important design criterion is to ensure that the secret key cannot be derived from the public key. In our setting, this means that when given g and g^a, no one should be able to calculate in polynomial time the value of a. This is known as the discrete logarithm (DL) problem and we will revisit it. On the multiplicative group of a finite field of large prime order, the DL problem is thought to be hard.
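The exchange above can be sketched in Python 3. The modulus below (a small Mersenne prime) and the base g = 3 are toy choices for illustration; real deployments use 2048-bit primes or elliptic curve groups:

```python
import secrets

# Public parameters
p = 2**127 - 1   # a Mersenne prime (toy size for illustration)
g = 3            # public base element

def dh_keypair():
    # Secret exponent a, public value g^a mod p
    a = secrets.randbelow(p - 2) + 2
    return a, pow(g, a, p)

# Alice and Bob each generate a key pair and exchange only the public values
alice_secret, alice_public = dh_keypair()
bob_secret, bob_public = dh_keypair()

# Both sides derive the same shared secret g^(a*b) mod p
alice_shared = pow(bob_public, alice_secret, p)
bob_shared = pow(alice_public, bob_secret, p)
assert alice_shared == bob_shared
```

An eavesdropper sees only p, g, g^a, and g^b; recovering a or b from these is an instance of the DL problem discussed next.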
Digital signatures: Encryption schemes help protect the content. However, they provide no proof that a certain sender was the actual author. This is true especially in the context of public-key cryptography where encryption keys are made public, allowing any party to claim that it was the actual sender. This problem can have drastic consequences when dealing with cryptocurrencies. Indeed, a cryptocurrency transaction consists of a message whose content allows a transfer of spending control from one owner to another. In Bitcoin for example, all valid transactions are publicly registered on the blockchain, and their content is purposefully not encrypted in order to enforce transparency and allow nodes to validate or reject them. However, the message in this case must be accompanied by a proof that the sender is actually the initiator of the transaction. Otherwise, anyone could initiate a transaction on behalf of someone else without their consent, potentially causing financial mayhem.
The authentication process is done through the use of a mathematical construct known as a digital signature. In the context of cryptocurrencies, we care about digital signatures and less so about encryption. The most important attribute of a digital signature is that of unforgeability. This can be defined in a variety of ways, but for all practical matters we mean resilience against existential forgery in the adaptive chosen-message attack. More details about digital signatures and the definitions of forgery can be found in the post entitled Digital Signature and Other Prerequisites. Generally speaking, digital signatures use the same public-key cryptography infrastructure described earlier for encryption and decryption. The sender signs with her private key in order to authenticate the message. Anyone on the network can then verify that she was the actual sender by running a verification algorithm that relies on the sender’s public key. Various examples of digital signatures including Schnorr, RSA, generic Pointcheval & Stern models, as well as a number of more elaborate ring signature schemes can be found in previous posts.
The discrete logarithm problem: A necessary condition to avoid forgery is that no one should be able to derive the private key from the public one. Here too, we realize the cruciality of one-way constructs. An important example of such a construct is the one associated with the DL problem encountered earlier. The hardness of the DL problem on some well-defined groups underlies the security of various digital signature schemes, including those adopted in cryptocurrencies. Formally, we define the DL problem as follows: given a finite cyclic group G, a generator g of G, and an element y of G, find an integer x such that
g^x = y
The smallest non-negative integer x that satisfies the above equation is known as the logarithm of y in base g, and we write x = log_g(y).
The difficulty associated with calculating x knowing g and y depends on the underlying group G. On some groups, the problem is easy to solve (i.e., we know of polynomial-time algorithms that can solve it). On others, it is harder. Moreover, there exist different levels of difficulty, the highest being exponential (i.e., the only known algorithm(s) to solve the problem are exponential in time). In the context of public-key cryptography, it is always desirable to operate on groups where the hardness of the DL problem is exponential.
An example of a group on which the DL problem is easy to solve is (Z_n, +) (i.e., the additive group of integers modulo n introduced in the Group and Finite Fields post). To see why, first note that this group is cyclic and that any equivalence class [g] with g relatively prime to n is a generator. Given [y], the DL problem consists in finding x such that
x·[g] = [g] + [g] + … + [g] (x times) = [y]
By the definition of Z_n, this is equivalent to finding x such that x·g ≡ y (mod n), i.e., such that x ≡ g^(-1)·y (mod n). Consequently, one can compute g^(-1), and hence x, efficiently using the extended Euclidean algorithm.
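To make the argument concrete, here is a Python 3 sketch that recovers the discrete logarithm in (Z_n, +) in polynomial time via a modular inverse. The parameter values used below are arbitrary illustrative choices:

```python
def additive_dlog(g, y, n):
    # Solve x * g ≡ y (mod n) for x, assuming gcd(g, n) == 1.
    # pow(g, -1, n) computes g's inverse via the extended Euclidean
    # algorithm (Python 3.8+), so the whole solve is polynomial time.
    g_inv = pow(g, -1, n)
    return (g_inv * y) % n
```

For instance, in Z_97 with generator [5], the logarithm x of [23] satisfies x·5 ≡ 23 (mod 97).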
An example of a group on which the DL problem is believed to be hard is the multiplicative group of the finite field of large prime order p. This group is cyclic and was introduced in the Group and Finite Fields post. The time required by the best-known algorithm to solve DL on it is sub-exponential in the size of p [10]: it executes faster than an exponential algorithm but is less efficient than a polynomial one.
Despite the hardness of DL on the multiplicative cyclic subgroup of a large finite field, it remains more desirable to operate on groups where the DL is thought to be exponentially hard. An example of such a group is the one associated with elliptic curves over finite fields. The elliptic curve discrete logarithm problem (ECDLP) over such groups is thought to be exponentially hard, with the best performing algorithm requiring time on the order of the square root of the group’s order [10].
By way of comparison, an ECDLP instance over a relatively small field offers a level of difficulty equivalent to that of a DL problem on the multiplicative cyclic subgroup of a much larger finite field. One implication is that cryptographic primitives based on ECDLP require significantly smaller keys. This explains why the digital signature schemes used in various cryptocurrencies (e.g., Bitcoin’s ECDSA, Monero’s MLSAG) rely on elliptic curve groups.
With this motivation behind us, we are now in a position to introduce the concept of an elliptic curve. Its theory is very rich and sits at the intersection of different branches of mathematics including analysis, geometry, and algebra. Our objective is to build a group structure based on the geometry of elliptic curves. The new group is referred to as an elliptic curve group and forms the public-key infrastructure of a number of cryptocurrencies in use today. We highlight that this introduction is limited to the minimum that we think is needed to appreciate the subject. It is by no means a comprehensive treatise. Readers interested in a detailed treatment of elliptic curve theory can consult e.g., [10]
In what follows, we first introduce an analytic view of elliptic curves over arbitrary fields. We describe the general Weierstrass form and derive a more simplified version as long as some constraints are observed. We then look at the geometry of elliptic curves over real numbers, and build a group structure after augmenting the curve with a point at infinity. The group’s binary operation, also referred to as point addition, is described geometrically and analytically. Later, we introduce the elliptic curve group over finite fields and finally, we describe the two elliptic curves used in Bitcoin and in Monero.
One way of defining an elliptic curve is as a set of points (x, y) satisfying the general Weierstrass equation given by:
y^2 + a1·x·y + a3·y = x^3 + a2·x^2 + a4·x + a6
The coefficients a1, a2, a3, a4, and a6 are chosen from a field K, and we say that the curve is defined over K. Note that a5 is purposefully left out for reasons that we keep out of scope for now. K could be for instance the field of real numbers or any finite field. Recall that in the Groups and Finite Fields post we mentioned that any finite field is either of prime order p or is an extension with p^m elements of a field of prime order p, where m can be any positive integer. We refer to p as the characteristic of the finite field and write char(K) = p. We derive below a simplified version of the Weierstrass equation applicable only if we exclude fields of characteristics 2 and 3.
Let’s first look at the left-hand side of the equation. It is tempting to complete the quadratic in y. We can always find a quantity c(x) = (a1·x + a3)/2 such that
y^2 + a1·x·y + a3·y = (y + c(x))^2 − c(x)^2
We could subsequently make a change of variables by substituting y with y1 = y + c(x). Since c(x) does not depend on y, we would have eliminated all terms that contain y as a factor. The aforementioned equation in y can equivalently be written as an equation in y1:
y1^2 = x^3 + a2·x^2 + a4·x + a6 + c(x)^2
One would then be tempted to conclude that this change of variables is always possible, except that division by 2 is not always permissible on an arbitrary field K. If char(K) = 2, then 2 will not admit a multiplicative inverse on K. However, division by 2 is possible on all other fields. In what follows, we always assume that char(K) ≠ 2. Consequently, we can compute c(x) and perform the change of variable. The Weierstrass equation becomes:
y1^2 = x^3 + (a2 + a1^2/4)·x^2 + (a4 + a1·a3/2)·x + (a6 + a3^2/4)
Letting d2 = a2 + a1^2/4, d4 = a4 + a1·a3/2, and d6 = a6 + a3^2/4, the elliptic curve equation becomes:
y1^2 = x^3 + d2·x^2 + d4·x + d6
The next step consists in simplifying the right-hand side of this equation. It turns out that any cubic equation can be transformed into an equivalent one with the quadratic term eliminated. We do so by substituting the variable x with a variable of the form x1 + t. The value of t is derived by first performing the substitution and then eliminating the coefficient of the quadratic term as follows:
(x1 + t)^3 + d2·(x1 + t)^2 + d4·(x1 + t) + d6 = x1^3 + (3t + d2)·x1^2 + (3t^2 + 2·d2·t + d4)·x1 + (t^3 + d2·t^2 + d4·t + d6)
We require that the coefficient of x1^2 be equal to 0. This imposes a constraint on t’s value, which must satisfy:
3t + d2 = 0
If char(K) ≠ 3, we can always find a multiplicative inverse of 3 in K and as a result, solve for t = −d2/3. The elliptic curve equation becomes:
y1^2 = x1^3 + (d4 − d2^2/3)·x1 + (d6 − d2·d4/3 + 2·d2^3/27)
We can relabel the variables x1 and y1 as x and y, and let a = d4 − d2^2/3 and b = d6 − d2·d4/3 + 2·d2^3/27. We then obtain the simplified Weierstrass equation of an elliptic curve over a field K such that char(K) ∉ {2, 3}:
y^2 = x^3 + a·x + b
In the following section we construct a group structure over elliptic curves. The tangent to the curve at a given point will play an essential role in this construction. As a result, elliptic curves that have singularities (i.e., points where the curve is not differentiable) are not desired and will be excluded. Examples of singularities on a curve include cusps and self-intersections. Analytically, a necessary and sufficient condition for a point (x0, y0) on a curve to be singular is for the partial derivatives at (x0, y0) to be equal to 0. For the elliptic curve equation F(x, y) = y^2 − x^3 − a·x − b = 0, we get:
∂F/∂x (x0, y0) = −(3·x0^2 + a) = 0 and ∂F/∂y (x0, y0) = 2·y0 = 0
The last equation implies that y0 = 0. If we substitute y0 = 0 in the curve equation and combine with the first equation, we conclude that
(x0, y0) is singular if and only if y0 = 0, x0^3 + a·x0 + b = 0, and 3·x0^2 + a = 0
Consequently, x0 must be a root of the cubic f(x) = x^3 + a·x + b as well as of its derivative f′(x) = 3·x^2 + a. This means that x0 is a double root of f. If we let δ denote the third root, we get the following factorization:
x^3 + a·x + b = (x − x0)^2·(x − δ)
By comparing coefficients, we find that δ = −2·x0, a = −3·x0^2, and b = 2·x0^3. This in turn implies that the discriminant 4·a^3 + 27·b^2 = −108·x0^6 + 108·x0^6 = 0. To summarize, we showed that given an elliptic curve E over a field K such that char(K) ∉ {2, 3}, we have:
E is singular ⟹ 4·a^3 + 27·b^2 = 0
The contrapositive statement allows us to derive a sufficient condition for E to be non-singular. Specifically, if 4·a^3 + 27·b^2 ≠ 0, then E is non-singular. Going forward, we only consider non-singular elliptic curves defined over fields of characteristic other than 2 or 3.
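This non-singularity condition is straightforward to check in code. Here is a Python 3 sketch over a prime field F_p; the secp256k1 parameters used as a sanity check (a = 0, b = 7) are the well-known Bitcoin curve values:

```python
def is_nonsingular(a, b, p):
    # A curve y^2 = x^3 + a*x + b over F_p (p prime, p > 3) is
    # non-singular iff its discriminant term 4a^3 + 27b^2 != 0 mod p
    return (4 * pow(a, 3, p) + 27 * pow(b, 2, p)) % p != 0
```

Conversely, a curve such as y^2 = x^3 − 3x + 2, whose cubic has a double root at x = 1, fails the check.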
In what follows, we endeavour to build the elliptic curve group over finite fields. To do so, we first consider elliptic curves over and study their geometry in order to devise a natural abelian group structure. Technically, the construction is performed in the projective plane as opposed to the euclidean plane. However, we attempt to motivate and justify the build-up without delving into the technicalities of projective geometry. Finally, we adapt the binary operation of the group to the case of a finite field.
Elliptic curves over can be easily drawn on the euclidean plane. We include below the graphs of five different elliptic curves, two of which are singular and three regular.
Elliptic curves exhibit x-axis symmetry. To see why, note that if (x, y) is on the curve, it must hold that (x, −y) is also on the curve. Indeed, (−y)^2 = y^2. Moreover, y^2 = x^3 + a·x + b by virtue of (x, y) being a point on the curve. Therefore, (−y)^2 = x^3 + a·x + b, demonstrating that (x, −y) is also a point on the curve.
In order to define any group, one needs to have an underlying set of elements as well as a binary operation on it that ensures that the group axioms are observed. In our case, the underlying set contains all the points in the euclidean plane that satisfy the elliptic curve equation. Note that this does not mean that they are the only elements of the set. As a matter of fact, we also include a special point O and refer to it as the point at infinity. We will motivate the introduction of O in the next section. For a and b such that 4·a^3 + 27·b^2 ≠ 0, the underlying set of the group takes the form:
E = {(x, y) : y^2 = x^3 + a·x + b} ∪ {O}
We still need to define a suitable binary operation that acts on a not-necessarily distinct pair of points in Any group must satisfy the closure axiom and so the output of the binary operation must also be a point in Intuitively, the most natural way of geometrically linking two points in the euclidean plane is with a straight line. It is hence reasonable to look at the different configurations of pairs of points on an elliptic curve defined on The diagram below summarizes the possible scenarios:
It is easy to observe that any two elliptic curve points must belong to one of these categories. As a result, these categories are collectively exhaustive. The configurations are also mutually exclusive. This should be clear except possibly for the case of an inflection point. In what follows we argue that an elliptic curve point with ordinate equal to 0 cannot be an inflection point.
0-ordinate points vs. inflection points: An inflection point of a curve is one where the curvature changes sign. Without delving deeper into the notion of curvature, this means that the second derivative of y with respect to x (assuming it exists on a neighborhood of the point) changes sign as the values of x cross the point’s abscissa. Intuitively, this suggests the following necessary condition for a point on the curve (where the second derivative is defined) to be an inflection point: the second derivative must vanish at that point.
However, an inflection point could still exist even when the second derivative is not defined at that point. The definition remains the same, i.e., an inflection point is one that marks a change in the curve’s concavity. As an example, one can look at the cube root function f(x) = x^(1/3) defined on the real line and verify that the point (0, 0) is an inflection point despite the fact that the second derivative is not defined at x = 0.
The domain of definition of an elliptic curve over the reals consists of the set
D = {x : x^3 + a·x + b ≥ 0}
This is due to the fact that over the field of real numbers, square values must be non-negative. Consequently, x^3 + a·x + b must be greater than or equal to 0. It then holds that y = ±(x^3 + a·x + b)^(1/2) on D. As a result:
dy/dx = ±(3·x^2 + a) / (2·(x^3 + a·x + b)^(1/2))
The second derivative is defined on the set
{x : x^3 + a·x + b > 0}
Among other things, this means that the second derivative is not defined on curve points whose ordinate is equal to 0. This however, is not enough to justify that 0-ordinate points are not inflection points. To rule out this possibility, we note that by virtue of being a cubic equation, x^3 + a·x + b = 0 can admit either one or three roots (not necessarily distinct) in the reals. We can then classify non-singular elliptic curves over the reals in two broad categories: disconnected or connected. The figures below showcase an example of each:
The first curve is an example of a non-singular disconnected curve. These curves admit three distinct real roots, none of which are interior points of the domain of definition D (we do not prove this statement). They are boundary points and hence cannot be crossed from right to left or vice-versa. The same applies to the connected curve, which admits one real root instead of three, but whose unique root is also a boundary point of D.
An implication of the aforestated definition is that the abscissa of an inflection point of a curve must be an interior point of D. Indeed, one should be able to cross it in order to validate a change in curvature. Consequently, points on the elliptic curve of the form (x, 0) cannot be inflection points since their abscissas (i.e., the real roots of x^3 + a·x + b), are boundary points of D.
Building the binary operation: Having defined the possible configurations of a pair of points on a non-singular elliptic curve on we now focus on finding a suitable binary operation. More specifically, given two points on the curve (not necessarily distinct), our objective is to operate on them in such a way that the output is also a point on the curve.
A rather natural way of doing so is to check if the line passing through the two points intersects the curve at another point. In what follows, we consider each configuration separately and demonstrate that the procedure outputs one or two suitable candidate points. We then show that only one of the two points is permissible (whenever they co-exist), paving the way to an algebraic description of the binary operation.
Configuration #1: and are two distinct points on the elliptic curve such that
We let and write
Configuration #2: and are two distinct points on the elliptic curve such that (and hence
Configuration #3: The two points on the elliptic curve are identical and have an ordinate equal to 0. We let the point be denoted by
Configuration #4: The two points on the elliptic curve are identical and constitute an inflection point. We let the point be denoted by
yields
Since we can cancel the factor from both sides and obtain
This is equivalent to
Configuration #5: The two points on the elliptic curve are identical, have non-zero ordinate and are not an inflection point. We let the point be denoted by
Letting and we get
Choosing a candidate: In configurations #1, #4 and #5, we ended up with two points (symmetric about the x-axis) to choose from with regard to the output of the binary operation. Only one of them safeguards the group axioms. To see which one does not, consider configuration #4:
Defining the elliptic curve group: We are now in a position to introduce the elliptic curve group on and verify that it respects the abelian group axioms.
thus defined, satisfies the abelian group axioms:
Note that for any such that is never equal to This can be readily verified by checking each configuration separately.
As a result, is the inverse of
Going forward, we only consider finite fields of prime order and do not cover extension fields. For an introduction to finite fields, we refer the reader to this post. A non-singular elliptic curve defined over a finite field of prime order differs from one defined over in the following way:
The main difference is that all computations are conducted in modular arithmetic. In what follows, we depict the elliptic curve over and over The geometry of elliptic curves over finite fields is not as intuitive as that of those over However, we will see that the algebraic formulation of their associated group closely follows that of elliptic groups over
For example, in order to draw over we select each value of in the set and plug it into the expression We subsequently check whether the result is a quadratic residue by verifying whether there exists a y in the field such that y^2 matches it. We find that the Euclidean representation of this curve over consists of the following 34 elements:
While over the real numbers the elliptic curve exhibited x-axis symmetry, over a finite field it exhibits symmetry about the horizontal line y = p/2. Indeed, if (x, y) is a point on the curve, then so is (x, p - y). This is because (p - y)^2 ≡ y^2 (mod p).
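The point-enumeration procedure just described can be sketched in a few lines of Python. The curve parameters below (a = 0, b = 7, p = 31) are illustrative stand-ins, not necessarily those of the curve depicted in the post's figure; the quadratic-residue test uses Euler's criterion.

```python
# Enumerate the affine points of y^2 = x^3 + a*x + b over F_p.
# NOTE: the parameters below are illustrative; the post's figure may
# use a different curve.
p, a, b = 31, 0, 7

points = []
for x in range(p):
    rhs = (x**3 + a*x + b) % p
    if rhs == 0:
        points.append((x, 0))                    # single solution y = 0
    elif pow(rhs, (p - 1) // 2, p) == 1:         # Euler's criterion: rhs is a square
        y = next(y for y in range(p) if (y*y) % p == rhs)
        points.append((x, y))
        points.append((x, p - y))                # symmetric counterpart

# Every point (x, y) is mirrored by (x, (p - y) mod p): symmetry about y = p/2.
assert all((x, (p - y) % p) in points for (x, y) in points)
```

Running the same loop with the field and curve used in the post's figure would reproduce the listed elements; here it merely illustrates the mechanics of the quadratic-residue check.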
Formally, we denote by the group associated with an elliptic curve defined over In particular:
To illustrate point addition in elliptic curve groups over finite fields, we look at and operate on points and Since and we compute
where 5 is the inverse of 25 in modulo 31 arithmetic (recall that this can be efficiently computed using the extended Euclidean algorithm introduced in the Groups and Finite Fields post).
We then compute
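The arithmetic of the worked example follows the general chord-and-tangent formulas, which can be scripted directly. Below is a minimal sketch of affine point addition over F_p, with None standing for the point at infinity; the curve (a = 0, b = 7 over F_31) and the sample point (1, 15) are assumptions for illustration, not necessarily those of the example above.

```python
def ec_add(P, Q, a, p):
    """Chord-and-tangent addition on y^2 = x^3 + a*x + b over F_p (None = infinity)."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None                                   # P + (-P) = point at infinity
    if (x1, y1) == (x2, y2):
        lam = (3*x1*x1 + a) * pow(2*y1, -1, p) % p    # tangent slope (doubling)
    else:
        lam = (y2 - y1) * pow(x2 - x1, -1, p) % p     # chord slope
    x3 = (lam*lam - x1 - x2) % p
    y3 = (lam*(x1 - x3) - y1) % p
    return (x3, y3)

# pow(v, -1, p) is Python's built-in modular inverse (extended Euclid under the
# hood); e.g., the inverse of 25 modulo 31 is indeed 5, as stated in the text.
assert pow(25, -1, 31) == 5

# Doubling the (illustrative) point (1, 15) on y^2 = x^3 + 7 over F_31:
R = ec_add((1, 15), (1, 15), 0, 31)
assert (R[1]**2 - R[0]**3 - 7) % 31 == 0              # the result lies on the curve
```

The same function covers configurations #1 (chord) and #5 (tangent), while the early returns handle the point at infinity and inverse pairs.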
The construction of the group is similar to that of its counterpart over the reals, except for the fact that all values are computed modulo p. The group, thus defined, satisfies the abelian group axioms:
As a result, is the inverse of
ECDLP, cardinality, and point multiples in Recall that the importance of elliptic curve groups over finite fields is largely derived from the exponential hardness of the DL problem on them. The Elliptic Curve Discrete Logarithm Problem (also known as ECDLP) can be stated as follows:
Q = x ⊗ G = G ⊕ G ⊕ … ⊕ G (x times)
The notation is unusual as it is commonly written as xG. We decide to make explicit the appearance of the ⊗ operator as a reminder that it denotes scalar multiplication with respect to the binary operator ⊕.
Finding such an x when it exists is thought to be exponentially hard. In the context of crypto-assets, we don't operate on the full set of curve points. Rather, we choose an element G such that order(G) is a very large prime. We then limit ourselves to the subgroup generated by G (refer to the post on Groups and Finite Fields for an introduction to subgroups). Given G and Q, ECDLP now consists in finding the smallest integer x such that Q = x ⊗ G. We are confident that such an x exists since Q is taken to be an element of the cyclic subgroup generated by G.
In the digital signature schemes used in e.g., Bitcoin and Monero, x represents the private key and Q the public one. It is important to be able to derive Q efficiently from x. However, as we stated earlier, it must not be polynomially feasible to compute x from Q. The exponential hardness of ECDLP helps with the latter requirement. Moreover, one can expect that the larger the subgroup generated by G, the better. This justifies the importance of having a sense of the cardinality of this subgroup, also denoted order(G). In order to ensure the former requirement, we need to have an efficient polynomial-time algorithm that can compute multiples of G.
If (x, y) is a solution, then so is (x, p - y). In addition, we have the point at infinity. As a result, since there are p distinct values of x in F_p, we get a maximum of 2p + 1 points in E(F_p). Over F_p, there are (p - 1)/2 quadratic residues and an equal number of quadratic non-residues (we don't prove this statement in this post). As a result, in the absence of any information, a random element of F_p has equal probability of being a square or not. One can then calculate the expected value of the number of points in E(F_p) to be p + 1. The German mathematician Helmut Hasse showed that |#E(F_p) - (p + 1)| ≤ 2√p.
The Dutch mathematician René Schoof relied partly on Hasse's theorem to devise a deterministic algorithm that can compute the group's order in polynomial time. This is known as the Schoof algorithm and its proof is beyond the scope of this post (readers interested in learning more about it can consult [9]). The important takeaway is that there exists a polynomial-time algorithm to calculate the order of an elliptic curve group over a finite field.
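While Schoof's algorithm is needed for cryptographic-size fields, Hasse's bound can be checked by brute force on a small curve. The parameters below (a = 0, b = 7, p = 31) are an illustrative choice:

```python
# Count #E(F_p) (affine points plus the point at infinity) by brute force
# and verify Hasse's inequality |#E(F_p) - (p + 1)| <= 2*sqrt(p).
def group_order(a, b, p):
    count = 1                                    # the point at infinity
    for x in range(p):
        rhs = (x**3 + a*x + b) % p
        count += sum(1 for y in range(p) if (y*y) % p == rhs)
    return count

p = 31
N = group_order(0, 7, p)
assert (N - (p + 1))**2 <= 4 * p                 # Hasse's bound, in squared form
```

The squared form of the inequality avoids floating-point square roots; it is equivalent to |N - (p + 1)| ≤ 2√p.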
In so far as the structure of this group is concerned, we mention without proof that is always either cyclic or the product of two cyclic groups.
for
This can be achieved in
for
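The efficient computation of point multiples alluded to above is typically achieved with the double-and-add method, which scans the bits of the scalar and needs only O(log x) group operations. Below is a sketch, bundled with the chord-and-tangent addition so it is self-contained; the curve (a = 0, b = 7 over F_31) and point (1, 15) are illustrative assumptions.

```python
def ec_add(P, Q, a, p):
    """Chord-and-tangent addition on y^2 = x^3 + a*x + b over F_p (None = infinity)."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None
    if (x1, y1) == (x2, y2):
        lam = (3*x1*x1 + a) * pow(2*y1, -1, p) % p
    else:
        lam = (y2 - y1) * pow(x2 - x1, -1, p) % p
    x3 = (lam*lam - x1 - x2) % p
    return (x3, (lam*(x1 - x3) - y1) % p)

def scalar_mult(x, G, a, p):
    """Compute x ⊗ G via double-and-add, i.e., O(log x) group operations."""
    R, addend = None, G
    while x > 0:
        if x & 1:
            R = ec_add(R, addend, a, p)          # add the current power-of-two multiple
        addend = ec_add(addend, addend, a, p)    # double
        x >>= 1
    return R

# Cross-check against naive repeated addition on an illustrative point.
G, a, p = (1, 15), 0, 31                         # (1, 15) lies on y^2 = x^3 + 7 over F_31
naive = None
for k in range(1, 8):
    naive = ec_add(naive, G, a, p)
    assert scalar_mult(k, G, a, p) == naive
```

This is what makes deriving a public key from a private scalar cheap even for 256-bit scalars, while recovering the scalar from the result remains an instance of ECDLP.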
Bitcoin’s cryptography relies on a particular curve known as secp256k1:
The parameters of secp256k1 can be found on page 9 of [7] and are as follows:
FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFE FFFFFC2F
Each element represents a half-byte (i.e., 4 bits) known as a nibble. There are 64 nibbles corresponding to the 256-bit representation mandated by the standard.
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000007
Here is a Euclidean representation of this curve when (it is not feasible to show it for ):
which in standard hex notation are given by:
79BE667E F9DCBBAC 55A06295 CE870B07 029BFCDB 2DCE28D9 59F2815B 16F81798
483ADA77 26A3C465 5DA4FBFC 0E1108A8 FD17B448 A6855419 9C47D08F FB10D4B8
Bitcoin’s public-key cryptography is hence conducted on the subgroup
which in standard hex notation is given by
FFFFFFFF FFFFFFFF FFFFFFFF FFFFFFFE BAAEDCE6 AF48A03B BFD25E8C D0364141
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000001
That means that the order of the subgroup generated by G is equal to the order of G, i.e., n. Since n is prime, the order of the subgroup must also be prime. As a result, it is a cyclic group and any of its elements other than the identity could serve as a generator (refer to Groups and Finite Fields for an introduction to cyclic groups).
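For concreteness, the quoted constants can be transcribed into Python and sanity-checked. The closed form p = 2^256 - 2^32 - 977 and the curve membership of G are well-known properties of secp256k1:

```python
# secp256k1 domain parameters as quoted above (curve y^2 = x^3 + 7 over F_p).
p  = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEFFFFFC2F
a, b = 0, 7
Gx = 0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798
Gy = 0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8
n  = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141

assert p == 2**256 - 2**32 - 977                 # closed form of the field prime
assert (Gy*Gy - (Gx**3 + a*Gx + b)) % p == 0     # the base point G lies on the curve
assert n < p                                     # the subgroup order n sits below p
```

Checking that G satisfies the curve equation is a cheap way to catch transcription errors when copying these 64-nibble constants from the standard.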
Another noteworthy SEC2 curve is secp256r1. The “r” specifier refers to the attribute “random” since the generation of the curve parameters and relies on a supposedly random process involving a seed value fed to a hash function. The seed value as well as the other curve attributes can be found on pages 9 and 10 of [7].
There was a fair amount of questioning as to why Satoshi opted for the usage of secp256k1 as opposed to that of another curve such as secp256r1. The reason(s) remain obscure and advocates that favor one curve over the other abound (e.g., [5], [1]). The point of contention lies in the randomness involved in selecting the curve parameters:
Suffice it to say that no one can tell with certainty whether one curve is preferred over the other. Assuming no backdoor, both curves exhibit comparable security standards.
The NIST debacle surrounding the Dual_EC_DRBG algorithm pushed some people away from NIST curves and closer to curves generated in academic circles instead. Two such curves are Curve25519 and its next of kin ed25519 used in Monero. Both are elliptic curves, but are not represented in short Weierstrass form. However, they could be transformed into one and we will see how shortly.
Curve25519 was originally introduced by the German-American mathematician and cryptologist Daniel Julius Bernstein. Unlike SEC curves and some of those advocated by NIST, Curve25519 is thought to be patent-free. It is also hailed for its faster computation of point multiples when compared to e.g., secp256r1 (NIST P-256) [6]. Moreover, it exhibits a security level comparable to that of secp256k1 and secp256r1 (assuming no backdoors). These favorable attributes paved the way for its ever-increasing adoption.
We first provide an overview of Montgomery and Twisted Edwards representations of elliptic curves of which Curve25519 and ed25519 are respective examples. We show that under certain constraints, any of these representations could be transformed into a short Weierstrass counterpart using a specific isomorphism. The existence of an isomorphism makes the two curves' respective groups equivalent and guarantees that the hardness of ECDLP is preserved on both. In the last section, we introduce the attributes of Monero's ed25519 curve.
Twisted Edwards and Montgomery representations: A Twisted Edwards curve defined on a field such that with parameters and such that is one that satisfies the following equation
It turns out that if, in addition, is a square in and is not, the curve will define a group structure. For our purposes, on a finite field such a curve will define a group where the superscript refers to "Edwards". One could define the binary operation from basic principles as we did earlier for curves in short Weierstrass form. However, we show shortly that each such Twisted Edwards curve is equivalent to another one in short Weierstrass form. The equivalence implicitly defines a corresponding group structure associated with it. The underlying set of the group is defined as
Note that it does not contain a point at infinity. We will attempt to justify its absence when we discuss the equivalence between curve representations.
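Although the text derives the Twisted Edwards group law via the Weierstrass equivalence, it may help to see the well-known Edwards addition formula in action. The sketch below uses a toy curve a·x^2 + y^2 = 1 + d·x^2·y^2 over F_13 with a = 1 (a square) and d = 2 (a non-square), purely illustrative parameters under which the formula is complete, i.e., the denominators never vanish:

```python
# Twisted Edwards addition law:
#   (x1,y1) + (x2,y2) = ((x1*y2 + y1*x2)/(1 + d*x1*x2*y1*y2),
#                        (y1*y2 - a*x1*x2)/(1 - d*x1*x2*y1*y2))
p, a, d = 13, 1, 2          # toy field; a is a square mod 13, d is not

def on_curve(P):
    x, y = P
    return (a*x*x + y*y - 1 - d*x*x*y*y) % p == 0

def ed_add(P, Q):
    (x1, y1), (x2, y2) = P, Q
    t = d * x1 * x2 * y1 * y2 % p
    x3 = (x1*y2 + y1*x2) * pow((1 + t) % p, -1, p) % p
    y3 = (y1*y2 - a*x1*x2) * pow((1 - t) % p, -1, p) % p
    return (x3, y3)

pts = [(x, y) for x in range(p) for y in range(p) if on_curve((x, y))]
assert (0, 1) in pts                                           # identity element
assert all(on_curve(ed_add(P, Q)) for P in pts for Q in pts)   # closure
```

Note that the identity is the ordinary affine point (0, 1), which is precisely why the underlying set needs no point at infinity.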
A Montgomery curve defined on a field such that with given parameters and such that is one that satisfies the following equation
The curve thus defined admits a group structure associated with it. For our purposes, on a finite field this curve defines a group where the superscript refers to "Montgomery". Here too, one could define the binary operation from basic principles using the chord and tangent method presented earlier for Weierstrass curves in short form. However, we show shortly that every Montgomery curve is equivalent to another one in short Weierstrass form. As is the case for Twisted Edwards curves, this equivalence implicitly defines a corresponding group structure associated with it. The underlying set of the group is defined as
Note that it contains a point at infinity denoted by . The need for such a point will be addressed when we discuss the equivalence between curve representations.
Every Montgomery curve is equivalent to a short Weierstrass one: Starting with a Montgomery curve
where (we will justify this constraint shortly), our objective is to transform it into a short-form Weierstrass curve
where the superscript refers to Weierstrass. Moreover, must be different than 2 or 3 for the short Weierstrass form to hold, and different than for it to be non-singular.
The sought after transformation must map every point on to a point on . Let’s first exclude the point at infinity of and focus on the other points. Let’s substitute with and with This yields
In order to make the coefficient of equal to 1 (as mandated by the short Weierstrass form), it must be that so that we can multiply both sides of the equation by the modular inverse of We get
We recognize a short Weierstrass form with and For it to be valid, we still need to ensure that This means that
The constraints and can be combined into a single one given by This explains its inclusion earlier when we defined the Montgomery form.
The above derivation mapped every point of the Montgomery curve to a point on the short-Weierstrass curve. Note that the point at infinity of the short-Weierstrass form was not attained by the previous transformation. As a result, we define a point at infinity on the Montgomery curve and map it to Consequently, we get the following injective map:
if
In order to show that this map is a bijection, we must demonstrate that it has an inverse. We claim that given such that the short Weierstrass form can be transformed into the Montgomery curve Here, parameters and are respectively given by and
This can be readily verified by substituting with and with This substitution shows that every point on the given short Weierstrass form is mapped to a point on the Montgomery curve. The only point left out is which we then map to As a result, we get the following inverse transformation:
if
Note that with the exception of and the map (and its inverse) have their two components expressed as a rational fraction in Such transformations are known as birational maps. As a result, the bijection between a Montgomery form and its associated short Weierstrass form is also referred to as a birational equivalence. One important observation is that any Montgomery form can be transformed into a short Weierstrass curve. However, the reverse is not always possible. We will not define the constraints that must be imposed on a short Weierstrass curve to admit a Montgomery counterpart. Suffice it to say that the specific values of and previously used satisfy the required constraints.
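As an illustration of this birational equivalence, the sketch below maps Curve25519 (the Montgomery curve v^2 = u^3 + 486662·u^2 + u over F_p with p = 2^255 - 19, so B = 1) to its short Weierstrass counterpart via the substitution x = u/B + A/(3B), y = v/B, and checks that the Weierstrass equation holds at the base point abscissa u = 9. The coefficient formulas a = (3 - A^2)/(3B^2) and b = (2A^3 - 9A)/(27B^3) follow from expanding that substitution.

```python
p = 2**255 - 19                 # Curve25519's field prime
A, B = 486662, 1                # Montgomery parameters: B*v^2 = u^3 + A*u^2 + u

# Short Weierstrass coefficients induced by (u, v) -> (u/B + A/(3B), v/B)
a = (3 - A*A) * pow(3 * B * B, -1, p) % p
b = (2*A**3 - 9*A) * pow(27 * B**3, -1, p) % p

u = 9                                             # Curve25519 base point abscissa
v2 = (u**3 + A*u*u + u) * pow(B, -1, p) % p       # v^2 read off the Montgomery equation
x = (u * pow(B, -1, p) + A * pow(3 * B, -1, p)) % p
y2 = v2 * pow(B * B, -1, p) % p                   # y^2 = (v/B)^2

assert (y2 - (x**3 + a*x + b)) % p == 0           # the image satisfies the Weierstrass form
```

The check is purely algebraic: it holds for any u on the curve, not just the base point, since the substitution transforms one equation into the other identically.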
Equivalence of (certain) Twisted Edwards and (certain) Montgomery curves: Starting with a Twisted Edwards curve
where and requiring additionally that be a quadratic residue over and a quadratic non-residue (we will justify these constraints shortly), our objective is to transform it into a Montgomery curve given by
where This is equivalent to and
The transformation must map every point on to a point on To do so, let’s substitute with and with This substitution attains every point on the Montgomery curve except for the following points:
The change of variable dictates that if then Consequently, we must exclude the points on the Twisted Edwards curve that have One can readily verify that these are the points and So let's apply the variable substitution, keeping these two points out for now. We will treat them separately in a moment.
And since we excluded the cases and we get
Since we can multiply both sides of the equation by the modular inverse of and get
To ensure that the coefficient of is equal to 1 as mandated by the Montgomery form, we must have so that we can multiply both sides of the equation by the modular inverse of In this case, we obtain
We recognize the Montgomery elliptic curve form with and However, we must still make sure that as mandated by the definition of a Montgomery form. Clearly, We only need to make sure that This translates to which implies that
To sum-up, the variable substitution that we introduced defines a map from to given by The constraints that need to be observed are the following:
Finally, note that we left out two points on the Twisted Edwards curve, namely and Observe however, that on the Montgomery elliptic curve, we also have two points that were not covered, namely the point and the point at infinity We thus define the following injective transformation:
, if
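A concrete instance of this map: on ed25519 (treated in the last section), the base point's y-coordinate is 4/5 mod p, and the first component u = (1 + y)/(1 - y) of the Edwards-to-Montgomery map lands exactly on Curve25519's base point u = 9, a well-known correspondence documented in RFC 7748. A quick check:

```python
p = 2**255 - 19                       # field prime shared by ed25519 and Curve25519
y = 4 * pow(5, -1, p) % p             # ed25519 base point y-coordinate (y = 4/5 mod p)
u = (1 + y) * pow(1 - y, -1, p) % p   # first component of the Edwards -> Montgomery map
assert u == 9                         # Curve25519's base point abscissa
```

This is one way the two curves' groups are used interchangeably in practice: points can be moved back and forth through the birational map without affecting the hardness of ECDLP.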
Conversely, starting with the Montgomery curve
where () and () are both quadratic non-residues over we can show that it can be transformed into a Twisted Edwards curve of the form
where
To do so, we make use of the inverse of the previous substitution. We substitute