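Secret-Key (Symmetric) Encryption Lab

2022-12-30

Notes from working through the SEED Labs secret-key encryption lab. Some parts still need to be filled in; corrections and discussion of anything doubtful are welcome.

Lab source: seed-labs

Environment: SEEDUbuntu 20.04 VM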

Task 1: Frequency Analysis

In this lab, you are given a cipher-text that is encrypted using a monoalphabetic cipher; namely, each letter in the original text is replaced by another letter, where the replacement does not vary (i.e., a letter is always replaced by the same letter during the encryption). Your job is to find out the original text using frequency analysis. It is known that the original text is an English article.

For details of how the plaintext was encrypted, see the lab manual: Secret-Key Encryption Lab

ciphertext.txt:


ytn xqavhq yzhu  xu qzupvd ltmat qnncq vgxzy hmrty vbynh ytmq ixur qyhvurn
vlvhpq yhme ytn gvrrnh bnniq imsn v uxuvrnuvhmvu yxx

ytn vlvhpq hvan lvq gxxsnupnp gd ytn pncmqn xb tvhfnd lnmuqynmu vy myq xzyqny
vup ytn veevhnuy mceixqmxu xb tmq bmic axcevud vy ytn nup vup my lvq qtvenp gd
ytn ncnhrnuan xb cnyxx ymcnq ze givasrxlu eximymaq vhcavupd vaymfmqc vup
v uvymxuvi axufnhqvymxu vq ghmnb vup cvp vq v bnfnh phnvc vgxzy ltnytnh ytnhn
xzrty yx gn v ehnqmpnuy lmubhnd ytn qnvqxu pmpuy ozqy qnnc nkyhv ixur my lvq
nkyhv ixur gnavzqn ytn xqavhq lnhn cxfnp yx ytn bmhqy lnnsnup mu cvhat yx
vfxmp axubimaymur lmyt ytn aixqmur anhncxud xb ytn lmuynh xidcemaq ytvusq
ednxuratvur

xun gmr jznqymxu qzhhxzupmur ytmq dnvhq vavpncd vlvhpq mq txl xh mb ytn
anhncxud lmii vpphnqq cnyxx nqenamviid vbynh ytn rxipnu rixgnq ltmat gnavcn
v ozgmivuy axcmurxzy evhyd bxh ymcnq ze ytn cxfncnuy qenvhtnvpnp gd 
exlnhbzi txiidlxxp lxcnu ltx tnienp hvmqn cmiimxuq xb pxiivhq yx bmrty qnkzvi
tvhvqqcnuy vhxzup ytn axzuyhd

qmruvimur ytnmh qzeexhy rxipnu rixgnq vyynupnnq qlvytnp ytncqnifnq mu givas
qexhynp iveni emuq vup qxzupnp xbb vgxzy qnkmqy exlnh mcgvivuanq bhxc ytn hnp
avheny vup ytn qyvrn xu ytn vmh n lvq aviinp xzy vgxzy evd munjzmyd vbynh
myq bxhcnh vuatxh avyy qvpinh jzmy xuan qtn invhunp ytvy qtn lvq cvsmur bvh
inqq ytvu v cvin axtxqy vup pzhmur ytn anhncxud uvyvimn exhycvu yxxs v gizuy
vup qvymqbdmur pmr vy ytn viicvin hxqynh xb uxcmuvynp pmhnayxhq txl axzip
ytvy gn yxeenp

vq my yzhuq xzy vy invqy mu ynhcq xb ytn xqavhq my ehxgvgid lxuy gn

lxcnu mufxifnp mu ymcnq ze qvmp ytvy viytxzrt ytn rixgnq qmrumbmnp ytn
mumymvymfnq ivzuat ytnd unfnh muynupnp my yx gn ozqy vu vlvhpq qnvqxu
avcevmru xh xun ytvy gnavcn vqqxamvynp xuid lmyt hnpavheny vaymxuq muqynvp
v qexsnqlxcvu qvmp ytn rhxze mq lxhsmur gntmup aixqnp pxxhq vup tvq qmuan
vcvqqnp  cmiimxu bxh myq inrvi pnbnuqn bzup ltmat vbynh ytn rixgnq lvq
bixxpnp lmyt ytxzqvupq xb pxuvymxuq xb  xh inqq bhxc enxein mu qxcn 
axzuyhmnq


ux avii yx lnvh givas rxluq lnuy xzy mu vpfvuan xb ytn xqavhq ytxzrt ytn
cxfncnuy lmii vicxqy anhyvmuid gn hnbnhnuanp gnbxhn vup pzhmur ytn anhncxud 
nqenamviid qmuan fxavi cnyxx qzeexhynhq imsn vqtind ozpp ivzhv pnhu vup
umaxin smpcvu vhn qatnpzinp ehnqnuynhq

vuxytnh bnvyzhn xb ytmq qnvqxu ux xun hnviid suxlq ltx mq rxmur yx lmu gnqy
emayzhn vhrzvgid ytmq tveenuq v ixy xb ytn ymcn muvhrzvgid ytn uvmigmynh
uvhhvymfn xuid qnhfnq ytn vlvhpq tden cvatmun gzy xbynu ytn enxein bxhnavqymur
ytn hvan qxaviinp xqavhxixrmqyq avu cvsn xuid npzavynp rznqqnq

ytn lvd ytn vavpncd yvgzivynq ytn gmr lmuunh pxnquy tnie mu nfnhd xytnh
avynrxhd ytn uxcmunn lmyt ytn cxqy fxynq lmuq gzy mu ytn gnqy emayzhn
avynrxhd fxynhq vhn vqsnp yx imqy ytnmh yxe cxfmnq mu ehnbnhnuymvi xhpnh mb v
cxfmn rnyq cxhn ytvu  enhanuy xb ytn bmhqyeivan fxynq my lmuq ltnu ux
cxfmn cvuvrnq ytvy ytn xun lmyt ytn bnlnqy bmhqyeivan fxynq mq nimcmuvynp vup
myq fxynq vhn hnpmqyhmgzynp yx ytn cxfmnq ytvy rvhunhnp ytn nimcmuvynp gviixyq
qnaxupeivan fxynq vup ytmq axuymuznq zuymi v lmuunh ncnhrnq

my mq vii ynhhmgid axubzqmur gzy veevhnuyid ytn axuqnuqzq bvfxhmyn axcnq xzy
vtnvp mu ytn nup ytmq cnvuq ytvy nupxbqnvqxu vlvhpq atvyynh mufvhmvgid
mufxifnq yxhyzhnp qenazivymxu vgxzy ltmat bmic lxzip cxqy imsnid gn fxynhq
qnaxup xh ytmhp bvfxhmyn vup ytnu njzviid yxhyzhnp axuaizqmxuq vgxzy ltmat
bmic cmrty ehnfvmi

mu  my lvq v yxqqze gnylnnu gxdtxxp vup ytn nfnuyzvi lmuunh gmhpcvu
mu  lmyt ixyq xb nkenhyq gnyymur xu ytn hnfnuvuy xh ytn gmr qtxhy ytn
ehmwn lnuy yx qexyimrty ivqy dnvh unvhid vii ytn bxhnavqynhq pnaivhnp iv
iv ivup ytn ehnqzceymfn lmuunh vup bxh ylx vup v tvib cmuzynq ytnd lnhn
axhhnay gnbxhn vu nufnixen quvbz lvq hnfnvinp vup ytn hmrtybzi lmuunh
cxxuimrty lvq ahxlunp

ytmq dnvh vlvhpq lvyatnhq vhn zunjzviid pmfmpnp gnylnnu ythnn gmiigxvhpq
xzyqmpn nggmur cmqqxzhm ytn bvfxhmyn vup ytn qtven xb lvynh ltmat mq
ytn gvrrnhq ehnpmaymxu lmyt v bnl bxhnavqymur v tvmi cvhd lmu bxh rny xzy

gzy vii xb ytxqn bmicq tvfn tmqyxhmavi xqavhfxymur evyynhuq vrvmuqy ytnc ytn
qtven xb lvynh tvq  uxcmuvymxuq cxhn ytvu vud xytnh bmic vup lvq viqx
uvcnp ytn dnvhq gnqy gd ytn ehxpzanhq vup pmhnayxhq rzmipq dny my lvq uxy
uxcmuvynp bxh v qahnnu vayxhq rzmip vlvhp bxh gnqy nuqncgin vup ux bmic tvq
lxu gnqy emayzhn lmytxzy ehnfmxzqid ivupmur vy invqy ytn vayxhq uxcmuvymxu
qmuan ghvfntnvhy mu  ytmq dnvh ytn gnqy nuqncgin qvr nupnp ze rxmur yx
ythnn gmiigxvhpq ltmat mq qmrumbmavuy gnavzqn vayxhq cvsn ze ytn vavpncdq
ivhrnqy ghvuat ytvy bmic ltmin pmfmqmfn viqx lxu ytn gnqy phvcv rxipnu rixgn
vup ytn gvbyv gzy myq bmiccvsnh cvhymu capxuvrt lvq uxy uxcmuvynp bxh gnqy
pmhnayxh vup vevhy bhxc vhrx cxfmnq ytvy ivup gnqy emayzhn lmytxzy viqx
nvhumur gnqy pmhnayxh uxcmuvymxuq vhn bnl vup bvh gnylnnu

Use Python to count n-gram frequencies in the ciphertext:


#!/usr/bin/env python3
from collections import Counter
import re
TOP_K  = 20
N_GRAM = 3
# Generate all the n-grams for value n
def ngrams(n, text):
    for i in range(len(text) -n + 1):
        # Ignore n-grams containing white space
        if not re.search(r'\s', text[i:i+n]):
           yield text[i:i+n]
# Read the data from the ciphertext
with open('ciphertext.txt') as f:
    text = f.read()
# Count, sort, and print out the n-grams
for N in range(N_GRAM):
   print("-------------------------------------")
   print("{}-gram (top {}):".format(N+1, TOP_K))
   counts = Counter(ngrams(N+1, text))        # Count
   sorted_counts = counts.most_common(TOP_K)  # Sort 
   for ngram, count in sorted_counts:                  
       print("{}: {}".format(ngram, count))   # Print

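Background reading:

Python计数器collections.Counter用法详解 (a guide to Python's collections.Counter)

Python yield 使用浅析 (a short introduction to Python's yield)

Results:

The analysis below compares these counts with the relative single-letter frequencies and the bigram and trigram frequencies of English listed on Wikipedia.

The most frequent letter in English text is usually e (about 13%), so the most frequent ciphertext letter, n, is taken to be e. The most frequent English bigram is th (3.56%) and the most frequent trigram is the (1.81%). Combined with the n ~ E guess and the ciphertext statistics (top bigram yt, top trigram ytn), it is reasonable to map ciphertext y to plaintext t and ciphertext t to plaintext h.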

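For those characters, you may want to change them back to its plaintext, as you may be able to get more clues. It is better to use capital letters for plaintext, so for the same letter, we know which is plaintext and which is ciphertext. => Accordingly, the key tables below show ciphertext letters in lowercase and plaintext letters in uppercase.

Encryption key so far (- = not yet determined):

Plaintext   A B C D E F G H I J K L M
Ciphertext  - - - - n - - t - - - - -

Plaintext   N O P Q R S T U V W X Y Z
Ciphertext  - - - - - - y - - - - - -

Apply the guesses made so far to the ciphertext and look again: tr 'ytn' 'THE' < ciphertext.txt > plaintext.txt

Looking closely at the text, vT occurs many times. T is settled, so the plausible two-letter words are it and at. Going by Wikipedia's bigram frequencies, at (1.49%) is tried first, i.e. v ~ A. Tx is also very common, and by the same reasoning the likely word is to, so x ~ O.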

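Encryption key so far (- = not yet determined):

Plaintext   A B C D E F G H I J K L M
Ciphertext  v - - - n - - t - - - - -

Plaintext   N O P Q R S T U V W X Y Z
Ciphertext  - x - - - - y - - - - - -

Substitute and analyze further: tr 'xv' 'OA' < plaintext.txt > plaintext1.txt

Run the (slightly modified) Python script again to recount the frequencies:

Compared with Wikipedia, the high-frequency letters E, T, A and O are now assigned; I and N come next. Among bigrams, TH and HE are assigned, while IN, ER, AN and RE are not. Among trigrams, THE is assigned, while AND, THA, ENT and OFT are not.

In the current text Aup occurs often, which suggests AND: u ~ N and p ~ D. With u as N, m is probably I. OzT is a frequent trigram, so z is tentatively taken as F. Check the result.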

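Encryption key so far (- = not yet determined):

Plaintext   A B C D E F G H I J K L M
Ciphertext  v - - p n z - t m - - - -

Plaintext   N O P Q R S T U V W X Y Z
Ciphertext  u x - - - - y - - - - - -

Substitute into the text and look again: tr 'upmz' 'NDIF' < plaintext1.txt > plaintext2.txt

After this substitution TFhN shows up, which looks wrong: no common English word fits tf*n, and F also lands in odd positions elsewhere. So F is reverted to z.

On the other hand, the word Ob appears very often. The candidates are of, ok and or (on is ruled out because N is already assigned). By frequency, or and of are the likely ones; the decision is deferred for now.

THIq suggests this, i.e. q ~ S. q is also very frequent, consistent with the frequency of S, and qAID further on reads naturally as said.

The text contains a long word, NONArENAhIAN. A search suggests nonagenarian, so r ~ G and h ~ R.

With R taken by h, the Ob above must be of: b ~ F.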

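Encryption key so far (- = not yet determined):

Plaintext   A B C D E F G H I J K L M
Ciphertext  v - - p n b r t m - - - -

Plaintext   N O P Q R S T U V W X Y Z
Ciphertext  u x - - h q y - - - - - -

Substitute into the ciphertext and look again.

The text is starting to take shape. Keep going!

AlARDS looks like awards, so l ~ W;

OSaARS looks like oscars, so a ~ C;

OzTSET looks like outset, so z ~ U;

NATIONAi looks like national, so i ~ L;

SzNDAd looks like Sunday, so d ~ Y;

AgOzT looks like about, so g ~ B;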

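Encryption key so far (- = not yet determined):

Plaintext   A B C D E F G H I J K L M
Ciphertext  v g a p n b r t m - - i -

Plaintext   N O P Q R S T U V W X Y Z
Ciphertext  u x - - h q y z - l - d -

Substitute into the ciphertext and analyze:

jUESTION looks like question, so j ~ Q;

IN TERcS OF looks like in terms of, so c ~ M;

eROBABLY looks like probably, so e ~ P;

oUBILANT looks like jubilant, so o ~ J;

fOTES looks like votes, so f ~ V;

LIsELY looks like likely, so s ~ K;

SEkIST looks like sexist, so k ~ X;

The only remaining letter, w, therefore maps to Z;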

Final encryption key:

Plaintext   A B C D E F G H I J K L M
Ciphertext  v g a p n b r t m o s i c

Plaintext   N O P Q R S T U V W X Y Z
Ciphertext  u x e j h q y z f l k d w
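The key table can also be applied programmatically instead of with repeated `tr` calls. The following is a minimal Python sketch, not part of the original write-up: `CIPHER_ALPHABET` is copied from the final table, and `apply_key` is a helper name introduced here for illustration. Passing a shorter pair of strings reproduces the partial substitutions used during the analysis.

```python
import string

# Ciphertext letters for plaintext A..Z, copied from the final key table.
CIPHER_ALPHABET = "vgapnbrtmosicuxejhqyzflkdw"


def apply_key(ciphertext, cipher_letters=CIPHER_ALPHABET,
              plain_letters=string.ascii_uppercase):
    """Replace each known ciphertext letter with its uppercase plaintext letter.

    Letters that are not in cipher_letters stay as they are, so a partial
    key behaves like the `tr` commands used during the analysis.
    """
    table = str.maketrans(cipher_letters, plain_letters)
    return ciphertext.translate(table)


if __name__ == "__main__":
    # Full key applied to the first words of ciphertext.txt.
    print(apply_key("ytn xqavhq yzhu"))
    # Partial key, equivalent to: tr 'ytn' 'THE'
    print(apply_key("ytn xqavhq", "ytn", "THE"))
```

Keeping plaintext in uppercase and untouched ciphertext in lowercase follows the convention suggested in the lab manual, so the remaining unknown letters stay easy to spot.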


With the complete key table, the ciphertext can now be fully decrypted. Part of the plaintext:

Task 2: Encryption using Different Ciphers and Modes

You can use the following openssl enc command to encrypt/decrypt a file.


$ openssl enc -ciphertype -e -in plain.txt -out cipher.bin \
-K 00112233445566778889aabbccddeeff \
-iv 0102030405060708

-in input file
-out output file
-e encrypt
-d decrypt
-K/-iv key/iv in hex is the next argument => -K is the key, -iv is the initialization vector
-[pP] print the iv/key (then exit if -P)

For full usage, see man openssl and man enc.

Task target: In this task, you should try at least 3 different ciphers.

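Background reading:

AES五种加密模式（CBC、ECB、CTR、OCF、CFB） (the five AES encryption modes)

openssl 对称加密算法enc命令详解 (a detailed guide to the openssl enc command)

CFB模式解读 (an explanation of CFB mode)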

-aes-128-cbc

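Cipher Block Chaining (CBC): the plaintext is split into blocks; each block is XORed with the initialization vector (for the first block) or with the previous ciphertext block, and the result is then encrypted with the key.

Experiment:

Use the text decrypted in Task 1 as the plaintext, copied to plain.txt: cp ./plaintext.txt ./plain.txt
Encrypt: openssl enc -aes-128-cbc -e -in plain.txt -out cipher.bin -K 00112233445566778889aabbccddeeff -iv 0102030405060708 -p

Note: the key and IV values are given in hexadecimal.

The salt in the output is not used in this demonstration. A salt strengthens the derivation of an encryption key from a user password (the -pass option) and is generated randomly by default. The -S option sets the salt explicitly; -nosalt disables it, which weakens security and is not recommended.

Tip

A point of confusion about key and IV length: the key is supposed to be 128 bits, and the command warns that a hex string is too short and is being padded with zero bytes to the required length. Yet the key printed by -p has only 32 characters, so where do 128 bits come from? The answer is a matter of units. The 32 characters are hexadecimal digits, while "128 bits" counts binary digits (16 bytes). One hex digit encodes 4 bits, so 32 x 4 = 128 bits. The key is therefore already full length; it is the 16-digit IV (64 bits) that gets padded with zero bytes.
View the encrypted file cipher.bin
Decrypt cipher.bin to plain1.txt: openssl enc -aes-128-cbc -d -in cipher.bin -out plain1.txt -K 00112233445566778889aabbccddeeff -iv 0102030405060708 -p
View the decrypted file plain1.txt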
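The hex-digit versus bit arithmetic from the tip can be checked directly. This is a small Python sketch added for illustration; `bit_length` and `zero_pad` are hypothetical helper names, and `zero_pad` only mimics the zero-byte padding that the openssl warning describes, it does not call openssl.

```python
KEY_HEX = "00112233445566778889aabbccddeeff"  # -K value used in this write-up
IV_HEX = "0102030405060708"                   # -iv value used in this write-up


def bit_length(hex_string):
    """Bits encoded by a hex string: 4 bits per hex digit, 8 per decoded byte."""
    return len(bytes.fromhex(hex_string)) * 8


def zero_pad(hex_string, n_bytes):
    """Append zero bytes up to n_bytes, mirroring the padding warning."""
    raw = bytes.fromhex(hex_string)
    return raw + bytes(max(0, n_bytes - len(raw)))


if __name__ == "__main__":
    print(bit_length(KEY_HEX))          # 32 hex digits
    print(bit_length(IV_HEX))           # 16 hex digits
    print(zero_pad(IV_HEX, 16).hex())   # IV as a full 16-byte block
```

The same check explains why the Blowfish experiment later needs no padding of the IV: a 64-bit block cipher takes exactly an 8-byte IV.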

-aes-128-cfb

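Cipher Feedback (CFB): the previous ciphertext block is fed back into the input of the cipher algorithm; "feedback" here simply means returning to the input. => The article CFB模式解读 listed above is worth reading; it clarifies how CFB differs from CBC, ECB and the other modes.

Experiment:

The plaintext is again plain.txt
Encrypt: openssl enc -aes-128-cfb -e -in plain.txt -out cipher1.bin -K 00112233445566778889aabbccddeeff -iv 0102030405060708 -p

View the result:
Decrypt: openssl enc -aes-128-cfb -d -in cipher1.bin -out plain2.txt -K 00112233445566778889aabbccddeeff -iv 0102030405060708 -p

View the result:

Tip!

These two small experiments made the difference between an algorithm and a mode much clearer to me. According to 图解密码技术, roughly speaking, DES can only encrypt 64 bits at a time, so a longer plaintext requires applying DES repeatedly, and the specific way of iterating is called the mode. This used to feel abstract: aren't the mode and the algorithm the same thing, since both encrypt data? My view has changed. By example, DES and AES are algorithms, while CFB, CBC and ECB are modes. The block diagrams of the modes make the same point (this is only my current understanding; corrections are welcome).

DES, for instance, is essentially a 16-round Feistel network. It is only the encryption algorithm and encrypts a single block, the 64 bits mentioned in the book. After 16 rounds it has finished encrypting that one block. Think of it as an encryptor: you feed in one block and it returns one ciphertext block. For a long plaintext this is only 1/n of the work (the plaintext is split into n blocks). Whether each block simply passes through the encryptor and the outputs are concatenated, or whether the ciphertext of one block feeds into the encryption of the next so that the blocks are chained together, is a question of how the encryption is iterated over the blocks. That is the mode.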
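The split between "algorithm" and "mode" can be made concrete in a few lines. The sketch below is illustrative only: `toy_block_encrypt` is a stand-in built from SHA-256, not AES or DES and not even decryptable, and all function names are introduced here. The point is that `ecb_encrypt` and `cbc_encrypt` differ only in how they iterate the same one-block primitive.

```python
import hashlib

BLOCK = 16  # bytes per block, as in AES


def toy_block_encrypt(key, block):
    """Stand-in for the 'encryptor': one 16-byte block in, 16 bytes out.

    NOT a real cipher (it cannot be decrypted); it only needs to be
    deterministic for this demonstration.
    """
    return hashlib.sha256(key + block).digest()[:BLOCK]


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def split_blocks(data):
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]


def ecb_encrypt(key, plaintext):
    """ECB: every block goes through the encryptor independently."""
    return b"".join(toy_block_encrypt(key, p) for p in split_blocks(plaintext))


def cbc_encrypt(key, iv, plaintext):
    """CBC: each block is XORed with the previous ciphertext block first."""
    out, prev = [], iv
    for p in split_blocks(plaintext):
        prev = toy_block_encrypt(key, xor(p, prev))
        out.append(prev)
    return b"".join(out)


if __name__ == "__main__":
    key, iv = b"k" * BLOCK, bytes(BLOCK)
    message = b"A" * BLOCK * 3  # three identical plaintext blocks
    print(len(set(split_blocks(ecb_encrypt(key, message)))), "distinct ECB blocks")
    print(len(set(split_blocks(cbc_encrypt(key, iv, message)))), "distinct CBC blocks")
```

The sketch assumes the plaintext length is a multiple of 16 bytes; padding is a separate concern handled in Task 4.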

References:

Hiroshi Yuki, translated by Zhou Ziheng. 图解密码技术. Beijing: Posts & Telecom Press, 2015. Turing Programming Series.

Katz J., Lindell Y. Introduction to Modern Cryptography. Chapman & Hall/CRC, 2007.

That was rather long-winded. If you have the energy, reading the books above is still recommended.

-aes-128-ecb

Electronic Codebook (ECB): the plaintext is split into equal-sized blocks and each block is encrypted independently.

The experiment is analogous:

Encrypt: openssl enc -aes-128-ecb -e -in plain.txt -out cipher2.bin -K 00112233445566778889aabbccddeeff -p (ECB takes no -iv parameter.)
Decrypt: openssl enc -aes-128-ecb -d -in cipher2.bin -out plain3.txt -K 00112233445566778889aabbccddeeff -p

-bf-cbc

The experiment is analogous:

Encrypt: openssl enc -bf-cbc -e -in plain.txt -out cipher3.bin -K 00112233445566778889aabbccddeeff -iv 0102030405060708 -p
Decrypt: openssl enc -bf-cbc -d -in cipher3.bin -out plain4.txt -K 00112233445566778889aabbccddeeff -iv 0102030405060708 -p

Task 3: Encryption Mode – ECB vs. CBC

Original image:

Target：We would like to encrypt this picture, so people without the encryption keys cannot know what is in the picture.

Experiment (ECB mode)

Encrypt the file using the ECB (Electronic Code Book) mode：

openssl enc -aes-128-ecb -e -in pic_original.bmp -out ecb_file.bmp -K 00112233445566778889aabbccddeeff -p
Open the encrypted image directly with the image viewer (eog):

The viewer reports that the BMP image has bogus header data and refuses to load it.

For the .bmp file, the first 54 bytes contain the header information about the picture, we have to set it correctly, so the encrypted file can be treated as a legitimate .bmp file. => The image data runs from byte 55 to the end of the file.
Set a correct header: We will replace the header of the encrypted picture with that of the original picture.

Instead of the commands below, the binary file can also be edited directly with bless.
- Get the header from pic_original.bmp: head -c 54 pic_original.bmp > header
  
  For details on head, see: Linux head 命令
- Get the data from ecb_file.bmp (from offset 55 to the end of the file): tail -c +55 ecb_file.bmp > body
  
  For details on tail, see: Linux tail 命令
- Combine the header and data together into a new file: cat header body > ecb_file_ok.bmp
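The three shell steps above amount to one byte-level splice. A minimal Python sketch of the same operation follows; `splice_header` and `fix_bmp` are names introduced here, and the 54-byte header length is the figure given in the lab text rather than something parsed from the file.

```python
BMP_HEADER_LEN = 54  # header size stated in the lab text


def splice_header(original, encrypted, header_len=BMP_HEADER_LEN):
    """Return the original header followed by the encrypted bytes after it.

    encrypted[header_len:] starts at 0-based offset 54, i.e. at byte 55
    counted from 1, which is what `tail -c +55` selects.
    """
    return original[:header_len] + encrypted[header_len:]


def fix_bmp(original_path, encrypted_path, output_path):
    """File-based wrapper with the same effect as head / tail / cat."""
    with open(original_path, "rb") as f:
        original = f.read()
    with open(encrypted_path, "rb") as f:
        encrypted = f.read()
    with open(output_path, "wb") as f:
        f.write(splice_header(original, encrypted))
```

For the files in this task the call would be `fix_bmp("pic_original.bmp", "ecb_file.bmp", "ecb_file_ok.bmp")`.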
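Open the image again with the image viewer (eog):

It opens normally. The image content has been encrypted, but under ECB only the colors have changed: the original shapes and outlines (an ellipse and a rectangle, and their relative positions) are still visible, so the rough content of the original can be recovered from the encrypted image.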

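Experiment (CBC mode)

Same procedure as for ECB:

Encrypt: openssl enc -aes-128-cbc -e -in pic_original.bmp -out cbc_file.bmp -K 00112233445566778889aabbccddeeff -iv 0102030405060708 -p
Set a correct header
View the encrypted image:

The image is now fully obscured. Compared with the original and with the ECB result, the CBC-encrypted image looks like uniform noise: no outlines remain, and nothing of value about the original can be made out by eye.

Observation with a picture of my own

Original image:

Same procedure as above. First, the image encrypted in ECB mode:

And the image encrypted in CBC mode:

The result matches the conclusion above. ECB hides the details of the picture, but the rough outline of the original can still be extracted. CBC encrypts far more thoroughly: the output is pure noise and reveals nothing of value about the original.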
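The visual difference can also be quantified: under ECB, identical 16-byte plaintext blocks (for example runs of a single background color) produce identical ciphertext blocks, so an ECB-encrypted image tends to contain many repeated blocks while a CBC-encrypted one does not. The helper below is a hedged sketch with names introduced here; it only counts repetitions and makes no claim about the actual numbers for the images in this task.

```python
from collections import Counter


def repeated_block_ratio(data, block_size=16):
    """Fraction of full blocks whose value occurs more than once in data."""
    blocks = [data[i:i + block_size]
              for i in range(0, len(data) - block_size + 1, block_size)]
    if not blocks:
        return 0.0
    counts = Counter(blocks)
    repeated = sum(c for c in counts.values() if c > 1)
    return repeated / len(blocks)


def ratio_for_file(path, skip=54, block_size=16):
    """Same measurement for a file, skipping the 54-byte BMP header."""
    with open(path, "rb") as f:
        return repeated_block_ratio(f.read()[skip:], block_size)
```

Running `ratio_for_file("ecb_file_ok.bmp")` and `ratio_for_file("cbc_file_ok.bmp")` (the second file name is assumed, by analogy with the ECB step) gives one number per mode to compare.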

Task 4: Padding

The PKCS#5 padding scheme is widely used by many block ciphers. We will conduct the following experiments to understand how this type of padding works.

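Create three files of 5, 10 and 15 bytes:

echo -n "12345" > f1.txt
echo -n "1234567890" > f2.txt
echo -n "678901234513579" > f3.txt
Without the -n option, the length will be 6, because a newline character will be added by echo.
Encrypt the files with AES (128-bit key, CBC mode), shown here for f1.txt: openssl enc -aes-128-cbc -e -in f1.txt -out f1e.txt -K 00112233445566778889aabbccddeeff -iv 0102030405060708 -p
Check the size of the encrypted files

The files have clearly grown: padding was added during encryption.
Decrypt the file and look at the padding: openssl enc -aes-128-cbc -d -in f1e.txt -out f1d.txt -K 00112233445566778889aabbccddeeff -iv 0102030405060708 -nopad

Unfortunately, decryption by default will automatically remove the padding, making it impossible for us to see the padding. However, the command does have an option called “-nopad”, which disables the padding, i.e., during the decryption, the command will not remove the padded data.

The sizes show that the padding bytes are still present after decryption:
Use xxd to inspect the padding in each decrypted file: xxd f1d.txt

In xxd's ASCII column the padding appears as '.', because the pad bytes are non-printable. The hex column shows the actual values: every pad byte equals the number of bytes added (0x0b for the 5-byte file, 0x06 for the 10-byte file, 0x01 for the 15-byte file).

Why is each encrypted file exactly 16 bytes, i.e. why is it padded up to 16 bytes? => AES works on 128-bit (16-byte) blocks, so a block shorter than 16 bytes, as here, is padded to 16 bytes. My first guess was that a file of exactly 16 or 32 bytes would get no extra padding. => Verified to be wrong 😅: in CBC mode a 16-byte file is padded to 32 bytes.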
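The padding rule observed above fits in a few lines of code. The sketch below implements the scheme usually called PKCS#7 (the lab text calls it PKCS#5; for 16-byte blocks the rule is the same): the pad length is always between 1 and 16, so a full extra block is added when the input is already a multiple of 16. The function names are introduced here for illustration.

```python
def pkcs7_pad(data, block_size=16):
    """Append pad_len bytes, each with value pad_len (1..block_size)."""
    pad_len = block_size - len(data) % block_size  # never 0
    return data + bytes([pad_len]) * pad_len


def pkcs7_unpad(padded, block_size=16):
    """Remove the padding, rejecting inputs that are not validly padded."""
    if not padded or len(padded) % block_size:
        raise ValueError("length is not a positive multiple of the block size")
    pad_len = padded[-1]
    if not 1 <= pad_len <= block_size:
        raise ValueError("invalid padding length")
    if padded[-pad_len:] != bytes([pad_len]) * pad_len:
        raise ValueError("invalid padding bytes")
    return padded[:-pad_len]


if __name__ == "__main__":
    for sample in (b"12345", b"1234567890", b"678901234513579", b"0123456789abcdef"):
        padded = pkcs7_pad(sample)
        print(len(sample), "->", len(padded), padded[len(sample):].hex())
```

Because the last byte always states how much to strip, the receiver can remove the padding unambiguously, which is exactly why a 16-byte input cannot be left unpadded.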

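Next, the padding under other modes:

ECB mode

Files are again padded to 16 bytes.

The padding bytes are the same as in CBC (shown as . by xxd).
CFB mode

In CFB mode no padding is added during encryption; the file keeps its original size.

The content confirms that there are no padding bytes.
OFB mode

OFB adds no padding either; the size is unchanged.

The content confirms that there are no padding bytes.

Open question

Why some modes of the same algorithm need padding while others do not deserves further reading. My current thinking is that the modes iterate differently. In some, the plaintext itself goes through the encryptor (the block cipher) as part of the iteration, so it must satisfy the block size (128 bits and so on). In others, the plaintext never passes through the cipher; it is only XORed with a keystream that the cipher produces, so no padding is required. (To be verified.)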

Task 5: Error Propagation – Corrupted Cipher Text

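Create a text file of at least 1000 bytes

The file consists entirely of the character 1
Encrypt the file with AES (128-bit key, ECB mode): openssl enc -aes-128-ecb -e -in f.txt -out fecb.txt -K 00112233445566778889aabbccddeeff -p
Use bless to flip one bit in byte 55 of the ciphertext

As shown:
Decrypt the corrupted ciphertext with the correct key (ECB takes no IV): openssl enc -aes-128-ecb -d -in fecb.txt -out fecbd.txt -K 00112233445566778889aabbccddeeff -p

View the decrypted plaintext:

Although only a single ciphertext bit was changed and the correct key was used, the damage spreads over roughly 14 to 15 visibly garbled bytes, in which no trace of the original plaintext remains. In ECB the corruption is confined to the 16-byte block that contains the modified byte (bytes 49 to 64), which decrypts to unrelated data. With a few more corrupted bits the effect on decryption would be severe. => Avalanche Effect

Next, the same experiment with the other modes of AES with a 128-bit key:

CBC mode

Encrypt: openssl enc -aes-128-cbc -e -in f.txt -out fcbc.txt -K 00112233445566778889aabbccddeeff -iv 0102030405060708 -p

Decrypt: openssl enc -aes-128-cbc -d -in fcbc.txt -out fcbcd.txt -K 00112233445566778889aabbccddeeff -iv 0102030405060708 -p

Again multiple bytes are garbled: the block containing the modified byte, plus the corresponding bit in the following block.
OFB mode

Unlike the two modes above, OFB shows no avalanche effect. The error was introduced at byte 55, and in the decrypted plaintext only that byte is wrong; all other bytes are unaffected.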
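The three propagation patterns can be reproduced without openssl. The sketch below uses a toy 4-round Feistel block cipher (SHA-256 as the round function) in place of AES; it is invertible, which is all that is needed here, but it is an assumption of this example and offers no security. All names are introduced for illustration.

```python
import hashlib

BLOCK, HALF, ROUNDS = 16, 8, 4


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def _f(key, i, half):
    # Round function of the toy Feistel network (illustration only).
    return hashlib.sha256(key + bytes([i]) + half).digest()[:HALF]


def enc_block(key, block):
    left, right = block[:HALF], block[HALF:]
    for i in range(ROUNDS):
        left, right = right, xor(left, _f(key, i, right))
    return left + right


def dec_block(key, block):
    left, right = block[:HALF], block[HALF:]
    for i in reversed(range(ROUNDS)):
        left, right = xor(right, _f(key, i, left)), left
    return left + right


def blocks(data):
    return [data[i:i + BLOCK] for i in range(0, len(data), BLOCK)]


def ecb(key, data, decrypt=False):
    f = dec_block if decrypt else enc_block
    return b"".join(f(key, b) for b in blocks(data))


def cbc_encrypt(key, iv, data):
    out, prev = [], iv
    for b in blocks(data):
        prev = enc_block(key, xor(b, prev))
        out.append(prev)
    return b"".join(out)


def cbc_decrypt(key, iv, data):
    out, prev = [], iv
    for b in blocks(data):
        out.append(xor(dec_block(key, b), prev))
        prev = b
    return b"".join(out)


def ofb(key, iv, data):
    # Encryption and decryption are the same operation in OFB.
    out, state = [], iv
    for b in blocks(data):
        state = enc_block(key, state)
        out.append(xor(b, state))
    return b"".join(out)


def flip_bit(data, byte_index, bit=0):
    corrupted = bytearray(data)
    corrupted[byte_index] ^= 1 << bit
    return bytes(corrupted)


def damaged_bytes(original, recovered):
    """1-based positions at which the recovered plaintext differs."""
    return [i + 1 for i, (a, b) in enumerate(zip(original, recovered)) if a != b]


if __name__ == "__main__":
    key, iv, plain, pos = b"0123456789abcdef", bytes(BLOCK), b"1" * 1008, 55 - 1
    print("ECB", damaged_bytes(plain, ecb(key, flip_bit(ecb(key, plain), pos), decrypt=True)))
    print("CBC", damaged_bytes(plain, cbc_decrypt(key, iv, flip_bit(cbc_encrypt(key, iv, plain), pos))))
    print("OFB", damaged_bytes(plain, ofb(key, iv, flip_bit(ofb(key, iv, plain), pos))))
```

In ECB and CBC the corrupted ciphertext block passes through the block decryption, so its whole plaintext block is scrambled; in OFB the ciphertext is only XORed with the keystream, so a flipped bit stays a single flipped bit.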

Task 6: Initial Vector (IV) and Common Mistakes

6.1 IV Experiment

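Using -aes-128-cbc, encrypt the same file f.txt with the same key and different IVs

Command 1: openssl enc -aes-128-cbc -e -in f.txt -out f6cbc1.txt -K 00112233445566778889aabbccddeeff -iv 0102030405060708 -p

Command 2: openssl enc -aes-128-cbc -e -in f.txt -out f6cbc2.txt -K 00112233445566778889aabbccddeeff -iv 0102030409080706 -p
View the encrypted results:

cat f6cbc1.txt

cat f6cbc2.txt

Even from the tail of each file it is clear that the two ciphertexts differ.
Using -aes-128-cbc, encrypt the same file f.txt with the same key and the same IV

Command (run twice, writing to f6cbc3.txt and then to f6cbc4.txt): openssl enc -aes-128-cbc -e -in f.txt -out f6cbc3.txt -K 00112233445566778889aabbccddeeff -iv 0102030405060708 -p
View the results of the two encryptions:

cat f6cbc3.txt

cat f6cbc4.txt

The two results are exactly the same!
Why must the IV be unique?

The experiments show that, under the same key, different IVs give different ciphertexts for the same content, while the same IV gives the same ciphertext. In other words, in CBC mode, reusing an IV makes plaintexts that begin with the same blocks encrypt to ciphertexts that begin with the same blocks. This is clearly a security risk. Reuse can preserve sensitive features of the plaintext (compare the image experiment above, where the content was encrypted yet outlines and positions were still visible). Moreover, if the attacker knows the IV, they can have a known plaintext encrypted under the same IV and compare the ciphertexts to learn about the secret plaintext. The IV must therefore be unique and must not be reused.

To illustrate how sensitive features could show through: suppose an image is encrypted row by row with the same IV for every row. By the property above, identical rows encrypt to identical results. One can imagine the output containing regions of identical shading surrounding regions of different shading, with the boundary between them acting as an outline, and the positions of the differing regions hinting at the layout of the plaintext. This does not mean information is always leaked; it depends on the structure and actual content of the image.

This still needs to be verified, and the argument above is open to question. Reference: 30分钟搞定AES系列（下）：IV与加密语义安全性探究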

6.2. Common Mistake: Use the Same IV

Known-plaintext Attack

The attack used in this experiment is called the known-plaintext attack, which is an attack model for cryptanalysis where the attacker has access to both the plaintext and its encrypted version (ciphertext). If this can lead to the revealing of further secret information, the encryption scheme is not considered as secure.


Plaintext (P1): This is a known message!
Ciphertext (C1): a469b1c502c1cab966965e50425438e1bb1b5f9037a4c159
Plaintext (P2): (unknown to you)
Ciphertext (C2): bf73bcd3509299d566c35b5d450337e1bb175f903fafc159

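Assume the messages were encrypted in OFB mode with the same IV. Recover P2:

For background on OFB mode, see https://blog.csdn.net/chengqiuming/article/details/82390910

The key point is that OFB does not run the plaintext through the cipher. Instead, each plaintext block is XORed with the output of the cipher to produce the ciphertext block. The details of the cipher (that is, how the XOR keystream is generated) can be ignored here; only the two operands of the XOR matter:

$$ P1 \oplus KEY1 = C1 $$

$$ P2 \oplus KEY2 = C2 $$

Because the key and the IV are the same, the cipher produces the same keystream, i.e. KEY1 = KEY2 = KEY, so:

$$ P1 \oplus KEY = C1 $$

$$ P2 \oplus KEY = C2 $$

$$ \Rightarrow KEY = C1 \oplus P1 $$

$$ \Rightarrow P2 = C2 \oplus KEY = C2 \oplus C1 \oplus P1 $$

This gives the way to recover P2: XOR the three known values. The experiment:

Use Python to XOR the three known strings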


#!/usr/bin/python3
# XOR two byte sequences element by element
def xor(first, second):
    return bytearray(x ^ y for x, y in zip(first, second))

MSG = "This is a known message!"
HEX_1 = "a469b1c502c1cab966965e50425438e1bb1b5f9037a4c159"
HEX_2 = "bf73bcd3509299d566c35b5d450337e1bb175f903fafc159"
# Convert the ASCII string and the hex strings to bytes
D1 = bytes(MSG, 'utf-8')
D2 = bytearray.fromhex(HEX_1)
D3 = bytearray.fromhex(HEX_2)
r1 = xor(D1, D2)  # keystream: P1 xor C1
r2 = xor(r1, D3)  # P2: keystream xor C2
print(r2.hex())

Run: python3 sample_code.py

This prints P2 as a hexadecimal string.

Decode the hexadecimal string with Bless

The recovered plaintext P2 is: Order: Launch a missile!
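The Bless step can be skipped by decoding the result in Python. This sketch follows the derivation above term by term (KEY = C1 xor P1, then P2 = C2 xor KEY); the variable names are chosen here to match the formulas.

```python
def xor(a, b):
    """XOR two byte strings; the result has the length of the shorter one."""
    return bytes(x ^ y for x, y in zip(a, b))


P1 = b"This is a known message!"
C1 = bytes.fromhex("a469b1c502c1cab966965e50425438e1bb1b5f9037a4c159")
C2 = bytes.fromhex("bf73bcd3509299d566c35b5d450337e1bb175f903fafc159")

KEY = xor(C1, P1)   # keystream shared by both messages (same key, same IV)
P2 = xor(C2, KEY)   # P2 = C2 xor C1 xor P1

if __name__ == "__main__":
    print(P2.decode("ascii"))
```

Only as many bytes of P2 can be recovered as there are known bytes of P1, because the keystream is known only where P1 and C1 overlap.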

6.3. Common Mistake: Use a Predictable IV

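Chosen-plaintext Attack

A good cipher should not only tolerate the known-plaintext attack described previously, it should also tolerate the chosen-plaintext attack, which is an attack model for cryptanalysis where the attacker can obtain the ciphertext for an arbitrary plaintext.

=> Your job is to construct a message and ask Bob to encrypt it and give you the ciphertext. Your objective is to use this opportunity to figure out whether the actual content of Bob’s secret message is Yes or No. For this task, you are given an encryption oracle which simulates Bob and encrypts message with 128-bit AES with CBC mode.

=> Determine whether the plaintext behind ciphertext C1 is Yes or No.

Start the container from the Labsetup directory: docker-compose up -d
Connect to the oracle: nc 10.9.0.80 3000

Based on the CBC mode described earlier, the idea for constructing the plaintext is as follows.

Let $P_x$ be the guessed value of some field of the record $P_i$, let IV be the next IV, and let $C_{i-1}$ be the ciphertext block that precedes $P_i$. Construct $P_M$ as

$$ P_M = IV \oplus C_{i-1} \oplus P_x $$

and send it to the server for encryption. CBC then computes

$$ C_M = E(k, IV \oplus P_M) $$

$$ \Rightarrow C_M = E(k, C_{i-1} \oplus P_x) $$

Comparing $C_M$ with $C_i$ shows whether the guessed plaintext is correct.

References: 【现代密码学入门】33. CBC模式 (3)：选择IV，水很深！, 对 CBC 模式的一些攻击, Initialization vector

For this lab the plaintext is constructed as follows. $C_{i-1}$ is unknown, but it is not needed: the plaintext is known to be either Yes or No, so it is very short (unlike the general reasoning above, where the plaintext is long and a block in the middle is being guessed). It therefore lies in the first block, which is XORed with the IV rather than with a previous ciphertext block (the bytes added by padding are part of the guess and do not change the Yes/No outcome). The plaintext to send is

$$ P_2 = IV \oplus IV' \oplus P_x $$

where IV is the next IV, IV' is the IV that was used, and $P_x$ is the guessed plaintext.

Since P1 is either "Yes" or "No", guess Yes first. Use openssl enc to obtain the hexadecimal string of Yes after padding.
- Create a file containing Yes: echo -n "Yes" > ftest.txt
- Encrypt: openssl enc -aes-128-cbc -e -in ftest.txt -out ftestcbc.txt -K 00112233445566778889aabbccddeeff -iv 0102030405060708 -p
- Decrypt (keeping the padding bytes): openssl enc -aes-128-cbc -d -in ftestcbc.txt -out ftestcbcd.txt -K 00112233445566778889aabbccddeeff -iv 0102030405060708 -nopad
- View the hexadecimal string: xxd ftestcbcd.txt
The resulting hexadecimal string is $P_x$ in the formula above. What matters is the hex encoding of Yes together with its padding bytes.

Use Python to XOR IV, IV' and $P_x$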


#!/usr/bin/python3
# XOR two byte sequences element by element
def xor(first, second):
    return bytearray(x ^ y for x, y in zip(first, second))

# HEX_1 and HEX_2: the two IVs reported by the oracle (the IV used and the next IV)
HEX_1 = "a0c95b7d5d98a4033fd545fdb0e1072f"
HEX_2 = "5cefb4cb5d98a4033fd545fdb0e1072f"
# HEX_3: "Yes" followed by its 13 padding bytes (0x0d)
HEX_3 = "5965730d0d0d0d0d0d0d0d0d0d0d0d0d"
# Convert the hex strings to bytes
D1 = bytearray.fromhex(HEX_1)
D2 = bytearray.fromhex(HEX_2)
D3 = bytearray.fromhex(HEX_3)
r1 = xor(D1, D2)
r2 = xor(r1, D3)
print(r2.hex())

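Run the Python code:

The resulting string is the constructed plaintext $P_2$. Enter it into the oracle.

Encrypt P2 and compare the ciphertext C2 with C1:

The input is 16 bytes long, and under the CBC padding rule 16 bytes are extended to 32 bytes, so the output is 32 bytes as well. Because of how CBC works, only the first 16 bytes need to be compared; the 16 bytes produced by the padding block do not affect the result.

The first 16 bytes of the ciphertext are identical to C1, which means Bob's plaintext is indeed Yes.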
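The whole attack can be rehearsed offline against a simulated oracle. The sketch below is an illustration under stated assumptions: `toy_E` is a SHA-256 based stand-in for AES under Bob's key (the attack only relies on encryption being deterministic), the `Oracle` class and its predictable IV sequence are invented here, and only the first ciphertext block is modelled because the padding block does not matter.

```python
import hashlib
import os

BLOCK = 16


def xor(a, b):
    return bytes(x ^ y for x, y in zip(a, b))


def pad(data):
    """PKCS#7 padding to a 16-byte block."""
    n = BLOCK - len(data) % BLOCK
    return data + bytes([n]) * n


def toy_E(key, block):
    # Stand-in for AES block encryption; NOT a real cipher.
    return hashlib.sha256(key + block).digest()[:BLOCK]


class Oracle:
    """Simulates Bob: CBC first-block encryption with IVs known in advance."""

    def __init__(self, secret):
        self.key = os.urandom(BLOCK)
        self._counter = 0
        self.iv_used = self._fresh_iv()
        self.c1 = toy_E(self.key, xor(pad(secret), self.iv_used))
        self.next_iv = self._fresh_iv()

    def _fresh_iv(self):
        # Predictable on purpose: this is the mistake being exploited.
        self._counter += 1
        return hashlib.sha256(b"iv" + bytes([self._counter])).digest()[:BLOCK]

    def encrypt(self, block16):
        c = toy_E(self.key, xor(block16, self.next_iv))
        self.next_iv = self._fresh_iv()
        return c


def guess_matches(oracle, guess):
    """Send P2 = pad(guess) xor IV_used xor IV_next and compare with C1."""
    probe = xor(xor(pad(guess), oracle.iv_used), oracle.next_iv)
    return oracle.encrypt(probe) == oracle.c1


if __name__ == "__main__":
    bob = Oracle(b"Yes")
    print("Yes?", guess_matches(bob, b"Yes"))
    print("No? ", guess_matches(bob, b"No"))
```

The next IV cancels inside the oracle, so the block that gets encrypted is exactly pad(guess) xor IV_used, the same input that produced C1 when the guess is right.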

Reference: Simulation of Chosen Plaintext Attack on CBC, Predictable IV with OpenSSL

Task 7: Programming using the Crypto Library

In this task, you are given a plaintext and a ciphertext, and your job is to find the key that is used for the encryption. You do know the following facts:

The aes-128-cbc cipher is used for the encryption.
The key used to encrypt this plaintext is an English word shorter than 16 characters; the word can be found from a typical English dictionary. Since the word has less than 16 characters (i.e. 128 bits), pound signs (#: hexadecimal value is 0x23) are appended to the end of the word to form a key of 128 bits.

Your goal is to write a program to find out the encryption key.


Plaintext (total 21 characters): This is a top secret.
Ciphertext (in hex format): 764aa26b55a4da654df6b19e4bce00f4ed05e09346fb0e762583cb7da2ac93a2
IV (in hex format): aabbccddeeff00998877665544332211

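The program is written in C and calls the crypto library.

The idea: iterate over words.txt, use each word (padded) as the key, and encrypt the known plaintext with the known IV. If the resulting ciphertext equals the given ciphertext, the current key is the one we are looking for.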
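The search loop itself is independent of the crypto library, so it can be sketched separately. In the Python sketch below, `pad_key` and `find_key` are names introduced here, and the `encrypt` argument is a hypothetical caller-supplied hook that would perform AES-128-CBC in the real task. The demo wires in `stand_in_encrypt`, which is deliberately not AES, just to exercise the loop.

```python
import hashlib


def pad_key(word, length=16):
    """Pad a word with '#' (0x23) to `length` bytes.

    Returns None for empty words and for words that are not shorter
    than `length`, since the task says the key word is shorter than 16.
    """
    raw = word.strip().encode("utf-8")
    if not raw or len(raw) >= length:
        return None
    return raw.ljust(length, b"#")


def find_key(words, encrypt, iv, plaintext, target):
    """Return the first padded key whose ciphertext equals `target`, else None.

    `encrypt(key, iv, plaintext) -> bytes` is supplied by the caller.
    """
    for word in words:
        key = pad_key(word)
        if key is not None and encrypt(key, iv, plaintext) == target:
            return key
    return None


def stand_in_encrypt(key, iv, plaintext):
    """NOT AES: a deterministic placeholder so the loop can be tried out."""
    return hashlib.sha256(key + iv + plaintext).digest()


if __name__ == "__main__":
    iv = bytes.fromhex("aabbccddeeff00998877665544332211")
    plaintext = b"This is a top secret."
    target = stand_in_encrypt(pad_key("Syracuse"), iv, plaintext)
    words = ["apple", "internationalization", "Syracuse", "zebra"]
    print(find_key(words, stand_in_encrypt, iv, plaintext, target))
```

Comparing raw ciphertext bytes, as here, avoids the hex encoding and case-insensitive string comparison that the C version needs.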


#include <stdio.h>
#include <openssl/evp.h>
#include <stdlib.h>
#include <string.h>
#include <assert.h>
#include <ctype.h>

/* Convert a hex string into the raw bytes it encodes.
   Example: "aabbcc" is three bytes, 0xaa 0xbb 0xcc (two hex digits per byte),
   so the result is the array {0xaa, 0xbb, 0xcc}. The caller must free() it. */
unsigned char* str2hex(char *str) {
    unsigned char *ret = NULL;
    int str_len = strlen(str);
    int i = 0;
    assert((str_len%2) == 0);
    ret = (unsigned char *)malloc(str_len/2);
    assert(ret != NULL);
    for (i =0;i < str_len; i = i+2 ) {
        /* %2hhx: read at most two characters as a hexadecimal integer (x)
           and store the value in an unsigned char (hh) */
        sscanf(str+i,"%2hhx",&ret[i/2]);
    }
    return ret;
}

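/* Case-insensitive comparison of the computed and the given ciphertext
   strings; returns 0 when they are equal. */
int strCmp(char *a,char *b){
	int d;
	for(;;a++,b++){
		/* compare the lowercase forms of the current characters */
		d=tolower((unsigned char)*a)-tolower((unsigned char)*b);
		/* d != 0: the strings differ, return immediately.
		   d == 0 and *a == '\0': the end was reached with every character
		   equal, so the strings match and 0 is returned. */
		if(d!=0 || !*a)
			return d;
	}
}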

/* do_crypt is adapted from the encryption example at
   https://www.openssl.org/docs/man1.1.1/man3/EVP_CipherInit.html
   (the page referenced by the lab manual).
   outfile: name of the file that receives the result.
   key: 16-byte key built from a padded word of words.txt.
   Returns 2 if the key reproduces the given ciphertext, 1 if it does not,
   and 0 on error. */
int do_crypt(char *outfile, unsigned char *key)
 {
     unsigned char outbuf[1024+EVP_MAX_BLOCK_LENGTH];
     char givenCipher[]="764aa26b55a4da654df6b19e4bce00f4ed05e09346fb0e762583cb7da2ac93a2";
     const unsigned char intext[]="This is a top secret."; /* known plaintext, 21 bytes */
     int outlen, tmplen, i, flag=1; /* flag: 1 = key does not match, 2 = key matches */
     unsigned char *iv=str2hex("aabbccddeeff00998877665544332211");
     char *buf_str, *buf_ptr;
     EVP_CIPHER_CTX *ctx;
     FILE *out;

     ctx = EVP_CIPHER_CTX_new(); /* create the cipher context */
     if (ctx == NULL) {
         free(iv);
         return 0;
     }
     /* Set up AES-128-CBC with the key and IV, then encrypt. The final call
        writes the padded last block right after the data already produced. */
     if (!EVP_EncryptInit_ex(ctx, EVP_aes_128_cbc(), NULL, key, iv) ||
         !EVP_EncryptUpdate(ctx, outbuf, &outlen, intext, sizeof(intext)-1) ||
         !EVP_EncryptFinal_ex(ctx, outbuf + outlen, &tmplen)) {
         /* Error */
         EVP_CIPHER_CTX_free(ctx);
         free(iv);
         return 0;
     }
     outlen += tmplen;
     EVP_CIPHER_CTX_free(ctx);
     free(iv);

     /* Hex-encode the ciphertext. It is binary data, so strlen() cannot be
        used on outbuf; sprintf NUL-terminates buf_str after every byte. */
     buf_str=(char*)malloc(2*outlen+1);
     if (buf_str == NULL)
         return 0;
     buf_ptr=buf_str;
     for(i=0;i<outlen;i++){
     	buf_ptr+=sprintf(buf_ptr,"%02x",outbuf[i]);
     }
     /* Compare the ciphertext produced by this key with the given one.
        On a match, write the key and the ciphertext to the output file. */
     if(strCmp(givenCipher,buf_str)==0){
         out = fopen(outfile, "wb");
         if (out == NULL) {
             free(buf_str);
             return 0;
         }
         fwrite(key,1,16,out);       /* the 16-byte key */
         fprintf(out," %s",buf_str); /* the ciphertext in hex */
         fclose(out);
         flag=2;
     }
     free(buf_str);
     return flag;
 }

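 /* Key padding: s is a word from words.txt, length is the target key length.
    The buffer behind s must hold at least length+1 bytes. */
 void padding(char *s,int length){
 	int l;
 	l=strlen(s);
 	while(l<length){
 		s[l]='#';/* pad with '#' (0x23) */
 		l++;
 	}
 	s[l]='\0';
 	//printf("%s\n",s);
 }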



 int main(){
 	char words[64],*outfile="task7decrypt.txt";/* room for a long line, the padding and the NUL */
 	FILE *key;
 	int j=1;
 	size_t len;
 	if((key=fopen("words.txt","r"))==NULL){
 		printf("Open Error!");
 		return -1;
 	}/* open words.txt */
 	while(fgets(words,sizeof(words),key)!=NULL){/* read one word (one line) at a time */
 		words[strcspn(words,"\r\n")]='\0';/* strip the line terminator */
 		len=strlen(words);
 		if(len==0 || len>=16)/* the key word is shorter than 16 characters */
 			continue;
 		padding(words,16);/* AES-128 needs a 16-byte key */
 		printf("Test key:%s\n",words);
 		j=do_crypt(outfile,(unsigned char *)words);
 		if(j==2)/* 2 means the correct key was found: stop searching */
 			break;
 	}
 	if(j==2)
 		printf("The discovered key has been saved in file %s.\n",outfile);
 	else
 		printf("No matching key found.\n");
 	fclose(key);
 	return 0;
 }

Compile: gcc -o task7 task7.c -lcrypto

Run: ./task7

View the key:

The key that was used is therefore: Syracuse

References:

Secret-Key Encryption Lab网安实验

Crypto Lab – Secret-Key Encryption (Part 2)

EVP_EncryptInit

converting a hex string array to hex array (duplicate)

C library function - sscanf()