by swass » Sat Sep 05, 2015 1:14 am
Moonman, we're golden. It works. Something else does appear to be causing OpenSSL to skip the hardware crypto for aes-ecb, even though aes-cbc works perfectly fine. I'm not sure why at the moment. This is clear from examining the interrupts associated with the hardware crypto. LUKS, on the other hand, does seem to be using the hardware crypto with aes-ecb, so this looks like a problem in OpenSSL itself (I'm using the cryptodev build, but the result is the same without it).
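For anyone who wants to repeat the interrupt check, here is a minimal sketch. It sums the counters on every `/proc/interrupts` line that matches a pattern, so the value can be compared before and after a benchmark run. The helper name `irq_count` and the pattern `crypto` are assumptions for illustration; the actual label of the CESA interrupt line depends on the kernel and driver in use.

```shell
#!/bin/sh
# irq_count PATTERN [FILE]
# Sum the per-CPU counters on every interrupt line matching PATTERN
# (case-insensitive). FILE defaults to /proc/interrupts.
# Hypothetical helper: the pattern to use (e.g. "crypto") is an assumption.
irq_count() {
    awk -v pat="$1" '
        tolower($0) ~ tolower(pat) {
            # Field 1 is the IRQ number ("22:"); the numeric fields that
            # follow are the per-CPU counts.
            for (i = 2; i <= NF && $i ~ /^[0-9]+$/; i++) total += $i
        }
        END { print total + 0 }
    ' "${2:-/proc/interrupts}"
}

# Example use (run by hand on the target box):
#   before=$(irq_count crypto)
#   openssl speed -evp aes-128-ecb -engine cryptodev -elapsed >/dev/null 2>&1
#   after=$(irq_count crypto)
#   echo "crypto interrupts during run: $((after - before))"
```

If the difference stays at or near zero for a cipher, the hardware engine was most likely not involved in that run.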
marvell_cesa without DMA:
# openssl speed -evp aes-128-cbc -engine cryptodev -elapsed
engine "cryptodev" set.
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-128-cbc for 3s on 16 size blocks: 45004 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 64 size blocks: 41091 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 256 size blocks: 37630 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 1024 size blocks: 25995 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 8192 size blocks: 5344 aes-128-cbc's in 3.00s
OpenSSL 1.0.2d 9 Jul 2015
built on: reproducible build, date unspecified
options:bn(64,32) rc4(ptr,char) des(idx,cisc,16,long) aes(partial) idea(int) blowfish(ptr)
compiler: gcc -I. -I.. -I../include -fPIC -DOPENSSL_PIC -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DHAVE_CRYPTODEV -DHASH_MAX_LEN=64 -Wa,--noexecstack -D_FORTIFY_SOURCE=2 -march=armv5te -O2 -pipe -fstack-protector --param=ssp-buffer-size=4 -Wl,-O1,--sort-common,--as-needed,-z,relro -O3 -Wall -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DAES_ASM -DBSAES_ASM -DGHASH_ASM
The 'numbers' are in 1000s of bytes per second processed.
type              16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-cbc        240.02k      876.61k     3211.09k     8872.96k    14592.68k
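For reference, the summary row follows directly from the "Doing ..." lines: operations times block size, divided by elapsed seconds, in units of 1000 bytes. A small sketch of that arithmetic (the helper name `kbytes_per_sec` is made up for this example):

```shell
#!/bin/sh
# kbytes_per_sec COUNT BLOCK_SIZE SECONDS
# Reproduces the openssl speed summary figure: 1000s of bytes per second.
# Hypothetical helper name, used only for illustration.
kbytes_per_sec() {
    awk -v n="$1" -v size="$2" -v secs="$3" \
        'BEGIN { printf "%.2fk\n", n * size / secs / 1000 }'
}

kbytes_per_sec 45004 16 3.00    # 16-byte blocks   -> 240.02k
kbytes_per_sec 5344 8192 3.00   # 8192-byte blocks -> 14592.68k
```

Both values match the first and last columns of the table above.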
marvell_cesa with DMA:
# openssl speed -evp aes-128-cbc -engine cryptodev -elapsed
engine "cryptodev" set.
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-128-cbc for 3s on 16 size blocks: 43172 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 64 size blocks: 40990 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 256 size blocks: 40796 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 1024 size blocks: 32189 aes-128-cbc's in 3.00s
Doing aes-128-cbc for 3s on 8192 size blocks: 10079 aes-128-cbc's in 3.00s
OpenSSL 1.0.2d 9 Jul 2015
built on: reproducible build, date unspecified
options:bn(64,32) rc4(ptr,char) des(idx,cisc,16,long) aes(partial) idea(int) blowfish(ptr)
compiler: gcc -I. -I.. -I../include -fPIC -DOPENSSL_PIC -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DHAVE_CRYPTODEV -DHASH_MAX_LEN=64 -Wa,--noexecstack -D_FORTIFY_SOURCE=2 -march=armv5te -O2 -pipe -fstack-protector --param=ssp-buffer-size=4 -Wl,-O1,--sort-common,--as-needed,-z,relro -O3 -Wall -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DAES_ASM -DBSAES_ASM -DGHASH_ASM
The 'numbers' are in 1000s of bytes per second processed.
type              16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-cbc        230.25k      874.45k     3481.26k    10987.18k    27522.39k
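To put the two aes-128-cbc runs side by side, here is a rough sketch that divides the with-DMA figure by the without-DMA figure for a given block size (`speedup` is a hypothetical helper name; the inputs are the table values with the trailing `k` dropped):

```shell
#!/bin/sh
# speedup WITHOUT_DMA WITH_DMA
# Prints WITH_DMA / WITHOUT_DMA as a ratio. Hypothetical helper for illustration.
speedup() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2fx\n", b / a }'
}

speedup 240.02 230.25       # 16-byte blocks   -> 0.96x
speedup 8872.96 10987.18    # 1024-byte blocks -> 1.24x
speedup 14592.68 27522.39   # 8192-byte blocks -> 1.89x
```

So DMA makes essentially no difference for tiny requests and nearly doubles throughput at 8192 bytes.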
marvell_cesa with DMA (other combinations):
# openssl speed -evp aes-128-ecb -engine cryptodev -elapsed
engine "cryptodev" set.
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-128-ecb for 3s on 16 size blocks: 2066134 aes-128-ecb's in 3.00s
Doing aes-128-ecb for 3s on 64 size blocks: 557144 aes-128-ecb's in 3.00s
Doing aes-128-ecb for 3s on 256 size blocks: 142277 aes-128-ecb's in 3.00s
Doing aes-128-ecb for 3s on 1024 size blocks: 35786 aes-128-ecb's in 3.00s
Doing aes-128-ecb for 3s on 8192 size blocks: 4477 aes-128-ecb's in 3.00s
OpenSSL 1.0.2d 9 Jul 2015
built on: reproducible build, date unspecified
options:bn(64,32) rc4(ptr,char) des(idx,cisc,16,long) aes(partial) idea(int) blowfish(ptr)
compiler: gcc -I. -I.. -I../include -fPIC -DOPENSSL_PIC -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DHAVE_CRYPTODEV -DHASH_MAX_LEN=64 -Wa,--noexecstack -D_FORTIFY_SOURCE=2 -march=armv5te -O2 -pipe -fstack-protector --param=ssp-buffer-size=4 -Wl,-O1,--sort-common,--as-needed,-z,relro -O3 -Wall -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DAES_ASM -DBSAES_ASM -DGHASH_ASM
The 'numbers' are in 1000s of bytes per second processed.
type              16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-128-ecb      11019.38k    11885.74k    12140.97k    12214.95k    12225.19k

# openssl speed -evp aes-256-ecb -engine cryptodev -elapsed
engine "cryptodev" set.
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-256-ecb for 3s on 16 size blocks: 1575155 aes-256-ecb's in 3.00s
Doing aes-256-ecb for 3s on 64 size blocks: 416983 aes-256-ecb's in 3.00s
Doing aes-256-ecb for 3s on 256 size blocks: 105958 aes-256-ecb's in 3.00s
Doing aes-256-ecb for 3s on 1024 size blocks: 26600 aes-256-ecb's in 3.00s
Doing aes-256-ecb for 3s on 8192 size blocks: 3323 aes-256-ecb's in 3.00s
OpenSSL 1.0.2d 9 Jul 2015
built on: reproducible build, date unspecified
options:bn(64,32) rc4(ptr,char) des(idx,cisc,16,long) aes(partial) idea(int) blowfish(ptr)
compiler: gcc -I. -I.. -I../include -fPIC -DOPENSSL_PIC -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DHAVE_CRYPTODEV -DHASH_MAX_LEN=64 -Wa,--noexecstack -D_FORTIFY_SOURCE=2 -march=armv5te -O2 -pipe -fstack-protector --param=ssp-buffer-size=4 -Wl,-O1,--sort-common,--as-needed,-z,relro -O3 -Wall -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DAES_ASM -DBSAES_ASM -DGHASH_ASM
The 'numbers' are in 1000s of bytes per second processed.
type              16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256-ecb       8400.83k     8895.64k     9041.75k     9079.47k     9074.01k

# openssl speed -evp aes-256-cbc -engine cryptodev -elapsed
engine "cryptodev" set.
You have chosen to measure elapsed time instead of user CPU time.
Doing aes-256-cbc for 3s on 16 size blocks: 41799 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 64 size blocks: 39894 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 256 size blocks: 38471 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 1024 size blocks: 29770 aes-256-cbc's in 3.00s
Doing aes-256-cbc for 3s on 8192 size blocks: 7218 aes-256-cbc's in 3.00s
OpenSSL 1.0.2d 9 Jul 2015
built on: reproducible build, date unspecified
options:bn(64,32) rc4(ptr,char) des(idx,cisc,16,long) aes(partial) idea(int) blowfish(ptr)
compiler: gcc -I. -I.. -I../include -fPIC -DOPENSSL_PIC -DZLIB -DOPENSSL_THREADS -D_REENTRANT -DDSO_DLFCN -DHAVE_DLFCN_H -DHAVE_CRYPTODEV -DHASH_MAX_LEN=64 -Wa,--noexecstack -D_FORTIFY_SOURCE=2 -march=armv5te -O2 -pipe -fstack-protector --param=ssp-buffer-size=4 -Wl,-O1,--sort-common,--as-needed,-z,relro -O3 -Wall -DOPENSSL_BN_ASM_MONT -DOPENSSL_BN_ASM_GF2m -DSHA1_ASM -DSHA256_ASM -DSHA512_ASM -DAES_ASM -DBSAES_ASM -DGHASH_ASM
The 'numbers' are in 1000s of bytes per second processed.
type              16 bytes     64 bytes    256 bytes   1024 bytes   8192 bytes
aes-256-cbc        222.93k      851.07k     3282.86k    10161.49k    19709.95k
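One way to read the ECB versus CBC contrast, offered as an interpretation rather than something measured directly here, is to look at the average time per operation on 16-byte blocks. A sketch (`usec_per_op` is a hypothetical helper name):

```shell
#!/bin/sh
# usec_per_op COUNT SECONDS
# Average wall-clock microseconds per operation. Hypothetical helper.
usec_per_op() {
    awk -v n="$1" -v secs="$2" \
        'BEGIN { printf "%.2f us\n", secs * 1000000 / n }'
}

usec_per_op 43172 3.00     # aes-128-cbc, 16-byte blocks -> 69.49 us
usec_per_op 2066134 3.00   # aes-128-ecb, 16-byte blocks -> 1.45 us
```

Roughly 70 microseconds per call for CBC looks like a fixed per-request cost of going through the engine, while ECB's 1.45 microseconds and its nearly flat throughput across block sizes look more like plain software AES. That would be consistent with the interrupt observation, though it is not proof on its own.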
LUKS without DMA:
cipher: aes-ecb
keysize: 256 bits
Write - O_DIRECT
# dd if=/dev/zero of=./bigfile count=1024 bs=1M oflag=direct
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 116.527 s, 9.2 MB/s
Read - O_DIRECT
# dd of=/dev/null if=./bigfile count=1024 bs=1M iflag=direct
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 110.617 s, 9.7 MB/s
LUKS with DMA:
Write - O_DIRECT
# dd if=/dev/zero of=./bigfile count=1024 bs=1M oflag=direct
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 80.6936 s, 13.3 MB/s
Read - O_DIRECT
# dd of=/dev/null if=./bigfile count=1024 bs=1M iflag=direct
1024+0 records in
1024+0 records out
1073741824 bytes (1.1 GB) copied, 78.4823 s, 13.7 MB/s
So, an improvement to be sure (roughly 40 to 45% faster on both reads and writes), but not an enormous one yet. I'll work on optimizing the block sizes and on changing from AES-256 to AES-128, which is still quite good.
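The dd rates can be rechecked from the raw byte count and elapsed time that dd prints. A minimal sketch (`mb_per_sec` is a hypothetical helper name; it uses decimal megabytes, which is how these dd figures work out):

```shell
#!/bin/sh
# mb_per_sec BYTES SECONDS
# Throughput in decimal megabytes (10^6 bytes) per second. Hypothetical helper.
mb_per_sec() {
    awk -v b="$1" -v s="$2" \
        'BEGIN { printf "%.1f MB/s\n", b / s / 1000000 }'
}

mb_per_sec 1073741824 116.527   # LUKS write, without DMA -> 9.2 MB/s
mb_per_sec 1073741824 80.6936   # LUKS write, with DMA    -> 13.3 MB/s
```

The same call with the read timings (110.617 s and 78.4823 s) gives the read figures.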
Last edited by swass on Sat Sep 05, 2015 2:12 am, edited 1 time in total.