Writeup for the cryptography problems that I designed.
Overview
Given that guessy and cipher-based cryptography challenges are way too common in Bangladeshi CTFs, my goal was to introduce problems that require mathematics and algorithms, something that requires
actual problem-solving, not hours of boring guesswork that we all despise.
It was quite surprising to see so many solves for baby_RSA though๐ฒ, given it was quite difficult. And that too at like around the last 30 minutes of the CTF ๐ค. I wish I too had such ideas keep coming to me at the very last moment when Iโm playing CTFs ๐.
We can see that the same message is passed through a noise function 50 times. What the noise does it, it takes a bit, and with 30% probability, that bit is flipped. Luckily, we have enough samples to recover the correct bit at each position.
We can represent the distorted list as a matrix, where each entry represents a bit. The bit ai,jโ represents the bit numbered j at the message no. i .
The k-th column represents all the bits that are in position k of the message. According to the problem formulation, around 70% of these bits would be correct, and so, the remaining 30% bits are incorrect. How do we know which bits are correct?
What I claim is, we can just note down which bit has more occurrences than the other, and that would be the correct bit. Since the correct bit will be have around 70% occurrences, itโs trivial to understand why my claim stands.
This could be a very trivial problem, had there been no modulo operations involved. In that case we could have solved this problem with simple quadratic solvers. But whatโs the fun in that? โ:3
How do we solve modular quadratic polynomails? Something called quadratic residues comes into play here. It solve problems of the type
Y=X2modp
where the value of Y and p is known. tonelli shanks is a very powerful algorithm which can solve such equations for us. But first, let us manipulate the given equation into something simpler which can be solved directly using tonelii shanks.
By the way, do note that all divisions shown here are modular multiplication done with inverses.
And so, we now have the form that was mentioned earlier
Y=X2modp
The solution script is as follows
fromCrypto.Util.numberimportlong_to_bytesasl2b,getPrime,bytes_to_longasb2l,inversep=159117138695601086935648462476725143896981591038416901486093706070754873563937272793294757366059667782136312013577603110660003394277897367203934519905683172161079448635248857040486200482627367334414675502994685936812333774063439076898281362682381888787053539531164627370506379676784327677612773681859553362469a=183170230465848410077175594145038110799b=177960951503783858139483105160729532851c=241771663291314104599898559749454094799y=81883801483428304918741984834363388238539421922474300691841915944763281416203781633389084991872486015041884084357260412052258732056118424392145242000539134316390251752292462492913837148154805999331901388735321855253311517735430229163344305926114721299791844599487270387540543138183327842649901510556454592197inv=inverse(a,p)inv2=inverse(2*a,p)t1=(y*inv)%pt2=(c*inv)%pt3=(b*inv2)%pt3=(t3*t3)%pk=(b*inv2)%py=(t1-t2+t3+p)%p# I googled for tonelli shanks implementation and this one came firstdeflegendre(a,p):returnpow(a,(p-1)//2,p)deftonelli(n,p):assertlegendre(n,p)==1,"not a square (mod p)"q=p-1s=0whileq%2==0:q//=2s+=1ifs==1:returnpow(n,(p+1)//4,p)forzinrange(2,p):ifp-1==legendre(z,p):breakc=pow(z,q,p)r=pow(n,(q+1)//2,p)t=pow(n,q,p)m=st2=0while(t-1)%p!=0:t2=(t*t)%pforiinrange(1,m):if(t2-1)%p==0:breakt2=(t2*t2)%pb=pow(c,1<<(m-i-1),p)r=(r*b)%pc=(b*b)%pt=(t*c)%pm=ireturnrs=tonelli(y,p)-kifs<0:s+=pprint(l2b(s))
The problem can be summarized in two parts. The first part is applying the GCD attack twice to recover two pairs of (message_reduced, modulus). Then those pairs are used in Chinese Remainder Theorem to find the actual message.
Let us analyze the server script.
#!/usr/local/bin/pythonfromCrypto.Util.numberimportgetPrime,bytes_to_longasb2l,long_to_bytesasl2bprint("Welcome to delphi's query service!!")primes=[getPrime(512)for_inrange(10)]withopen('flag.txt','rb')asf:flag=f.read()m=b2l(flag)assert(m.bit_length()>1200andm.bit_length()<2000)used_indices=set()for_inrange(5):print('Enter 2 indices for primes to be used for RSA (eg. 0 4): ')i,j=map(int,input().split())ifiinused_indicesorjinused_indicesori<0orj<0ori==j:print('Illegal values given!!')exit(2)i,j=i%10,j%10used_indices.add(i)used_indices.add(j)p,q=primes[i],primes[j]n=p*qe=0x10001ct=pow(m,e,n)print('n = ',n)print('ct = ',ct)
We see that the server initially generates 10 primes that are 512 bits each. Then we can interact with the server 5 times. Each interaction is of the following type : we give the server 2 indices i, j and the server uses primesiโ and primesjโ to use as primes for RSA to encrypt our flag.
There is a catch though, we canโt reuse any indices. Nor can we use negative indices. What benefit would have negative indices given us anyway? We know that negative indices wraps around the list in python, that is, for an array of size n, the index -k actually denotes the index n - k. In that way we can reuse indices in our query.
But is that the only way we can reuse indices? There is no checking in our code whether i > n. Rather the input is taken modulo n. In that way, we can reuse the same index i using i and i + n since both of them becomes i when reduced by n.
Now that we understand how to use the same prime in two different queries, how does that help us? It helps us to factorize the RSA modulus. Letโs say we use the following queries: 0 1 and 11 2. The second query actually translates to 1 2 after being reduced modulo n.
n1โ=primes0โโprimes1โ
n2โ=primes1โโprimes2โ
If we take the GCD of those two modulus, we get primes1โ.
Using primes1โ, we can factorize both n1โ and n2โ. With the RSA modulus being cracked, we can now easily recover our flag ^^.
To spoil the mood, you would actually get a gibberish. Notice the following line in our server script:
assert(m.bit_length()>1200andm.bit_length()<2000)
Our RSA modulus is of 1024 bits. It means there will be losses of bits, that is, we would actually get flag % modulus instead of the original flag. That is why you got gibberish.
What can we do in this situation? We need a modulus that is greater than 2000 bits. Chinese Remainder Theorem is the way to go. If we use the following pair in our CRT,
reducedMessage1โ=flagmodmodulus1โ
reducedMessage2โ=flagmodmodulus2โ
Chinese Remainder Thoerem will combine the two modulus and give us a 2048 bits modulus. We are going to get flagmodmodulus1โโmodulus2โ. This is enough as the new modulus (which is the product of previous two modulus) is more than the upper limit of flag size.
With the idea ready at hand, the solution script can be coded easily.
Flag : CTF_BD{i_made_this_flag_purposefully_bigger_so_that_u_are_forced_to_use_the_chinese_remainder_theorem_otherwise_it_would_be_too_easy_if_it_was_just_the_gcd_attackxD}